Word and String Searches
Searches may be performed by context word or string, alone or combined with any of the parameters for date (Century or Year), Location, or Source text. Searches produce results only if the word or string queried includes its diacritics, entered either by choosing from the drop-down auto-fill selection provided for Word searches or by typing the word(s) with accents/diacritics. For help typing accents/diacritics, see the Using Accents Help Guide. Word searches may NOT be run in combination with a Headword search. Do not include punctuation, such as commas or periods, in a Word or String search.
To search for a context word (that is, a word that appears anywhere in any of the citations in the LHA corpus), the user should use the Word and String search boxes. It is important to remember that Headword searches provide results for words that are identified as contextually important in the citation and defined as a headword (see Headword Search Help Guide for more information). Word and String searches look for any word that appears in any citation. Additionally, since word and string searches (as well as other searches, for instance by source or location) involve culling citations from all headwords, and since some citations (or nearly identical citations) were used to illustrate more than one headword, searches may produce partially identical citations. These are easily identifiable from the source information that accompanies them. This is explained under the Headword Search Help Guide and is a function of the original LHA project as published on CD-ROM.
The Word search option will find a whole word, with the exact orthography, in singular or plural, masculine or feminine, as typed in the search box, or selected from the drop-down auto-fill list. The auto-fill list provides all words found in the corpus. As the user begins to type in the word, suggested options begin to appear, up to ten at a time, in a scrolling list. If the user finishes typing the word he/she wishes to search and it does not appear in the drop-down menu, it is not found in the corpus. If the user wishes to search on a regularized word form, he/she may consider a Headword search (again, see Headword Search Help Guide for more information on headwords).
Within the Word search option, the user may also use wildcard options, to look for words with variant orthography. Wildcards create search options for single or multiple variant characters (see Using Wildcards Help Guide for more information).
For example, to search for all forms of the word caballo (cauallo, cavallo, caballo) in all citations, regardless of whether it is the headword, the user could use the single-character wildcard “_” and search on ca_allo. All instances of caballo, cauallo, and cavallo in the singular would be returned. To search on the plural Word form, the user could input ca_allo_.
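The behavior of the single-character wildcard can be illustrated with a minimal Python sketch. It is not the program's actual implementation: it assumes that “_” stands for exactly one character and that a Word search must match a whole word. The function names and the sample sentence are hypothetical.

```python
import re

def word_pattern(query):
    """Build a regex from a Word query; '_' is assumed to mean exactly one character."""
    parts = ["." if ch == "_" else re.escape(ch) for ch in query]
    return re.compile("".join(parts))

def word_search(query, citation):
    """Return the whole words in `citation` that match `query` from start to end."""
    pattern = word_pattern(query)
    words = re.findall(r"\w+", citation)
    return [w for w in words if pattern.fullmatch(w)]

# Made-up sample text, not a citation from the corpus.
sample = "dos cauallos y un cavallo; el caballo blanco"

singular = word_search("ca_allo", sample)   # cavallo, caballo
plural = word_search("ca_allo_", sample)    # cauallos
print(singular, plural)
```

Because the match must cover the whole word, ca_allo does not return cauallos in this sketch; the plural needs its own query.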
For a comparison between Word and Headword search results, the Word search on ca_allo produces 1,490 instances of the forms, and the search on ca_allos produces another 977 in the plural, while a Headword search on caballo (which includes orthographic variation and singular and plural) produces 164 instances of caballo as a headword. As described in the Headword Search guide, Word searches provide a complete resource for all appearances of the word in the LHA corpus.
In Word searches only, for ease of recognition, the word queried is highlighted in bold font in the search results. It should be noted, however, that if the word is found within parentheses, and is thus an editorial addition, as described in the Citation Help Guide, the result is not highlighted, since it is not a context word that appears in the original text source.
The String search option looks for a group of characters within a word or string. The user can employ wildcards in String searches (again, see Using Wildcards Help Guide for more information), for single or multiple character variation. For example, to search on verb forms containing the string tuv, the user could use the String search to find that string in any word, which would provide results for tuvo, tuviesen, estuviese, estuvo, detuvo, among many others. One could refine the search for forms of the verb tener by placing a space before tuv [“ tuv”], which will cause the program to look for word-initial forms of tuv: tuvo, tuve, tuviese, etc. Or the user could search on “ tu_i”, with a space before the search characters, for forms of tuviesse, tuviese, tubiesse, tuviere, etc.
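The effect of a leading space can be shown with a short Python sketch. It rests on assumptions: the String search is modeled as a plain substring match, “_” stands for exactly one character, and each word is padded with spaces to imitate its position inside running text. The helper names are hypothetical, and the word list reuses forms mentioned above.

```python
import re

def string_pattern(query):
    """'_' is assumed to match one character; spaces and letters are literal."""
    return re.compile("".join("." if ch == "_" else re.escape(ch) for ch in query))

def hits(query, words):
    """Return the words in which the query string occurs (words padded with spaces)."""
    pattern = string_pattern(query)
    return [w for w in words if pattern.search(" " + w + " ")]

forms = ["tuvo", "tuviesen", "estuviese", "estuvo", "detuvo", "tubiesse"]

anywhere = hits("tuv", forms)        # string found anywhere in the word
word_initial = hits(" tuv", forms)   # string found only at the start of a word
with_wildcard = hits(" tu_i", forms) # word-initial, one variant character
print(anywhere, word_initial, with_wildcard)
```

In this sketch the query without a space returns estuviese, estuvo, and detuvo along with the forms of tener, while the two queries with a leading space return only word-initial matches.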
To use the String search to find locutions, examples might be:
- Search for sin par, typing in “ sin par ”, with spaces before sin and after par, to provide word-initial sin and word-final par.
- Search on “ a ca_allo ”, with spaces before and after the words, for uses of a caballo.
- Search for a osadas by simply typing in the words, without added spaces or wildcards; to find cases where the locution appears as one word, run a separate search on aosadas.
The user should remember that String searches look for character groupings, so a query matches items with any number of characters before or after the string. For example, if the user searched on osadas in order to find all examples of aosadas / a osadas, results would include rosadas, glosadas, posadas, etc. Using a space to indicate word-initial and word-final position, when possible, and searching for as many sequential characters as possible will therefore provide more accurate search results. Returning to the tener examples above, if the user searched on “ tu_” (with a space before “tu_” for word-initial position), expecting to find forms of the verb tener, the program would bring up many more results than just verb forms. Since the program searches for that string in any context, with any number of characters following the single-character wildcard, results will include items such as turquesados and tunas. Again, refining the search to include more sequential characters, such as “ tu_o” and “ tu_ie”, will provide better results.
The general search “ tu_” can still be useful, however, even if words such as tunas appear in the list. The user can use the search filter function and search the results on the screen for tuvo or whatever form is the object of the search. (See the help guide on Filtering and Exporting Search Data for more information.)
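The two-step approach (a broad search followed by a filter on the results) can be sketched in Python as follows. This is only an illustration under stated assumptions: the broad query “ tu_” is modeled as a space, the letters tu, and any one further character, and the on-screen filter is modeled as a simple substring test. The citation texts are made up and are not taken from the corpus.

```python
import re

# Made-up stand-ins for citation texts; not taken from the corpus.
citations = [
    "no tuvo otra salida",
    "dos mantos turquesados",
    "comían tunas y maíz",
    "si tuviese lugar",
]

# Step 1: the broad String query " tu_" (space, "tu", one more character).
# Each text is padded with a leading space so a first word can also match.
broad = re.compile(r" tu.")
results = [c for c in citations if broad.search(" " + c)]

# Step 2: filter the displayed results for the form that is the object of the search.
filtered = [c for c in results if "tuvo" in c]

print(len(results), filtered)
```

The broad step keeps all four sample texts, including those with turquesados and tunas; the filter step then reduces them to the single text containing tuvo.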
The % wildcard, which stands for zero or more variant characters between characters or words in a String search, can provide some flexibility for proximity searches.
For example, “labio%rojo” will bring up all the instances of labio(s) followed by rojo(s), with zero or more characters between the two strings. Results for that search are:
Citation: [1867 Colombia] sus labios rojos, húmedos y graciosamente imperativos [IMR 14]
Citation: [1961 México] se limpió los labios y manchó la servilleta de rojo [FMA 23]
Citation: [c. 1966 Cuba] labios pintados de rojo escarlata [CIT 285]
Again, the more consecutive characters the user can provide in the String search, the more focused and accurate the results will be.
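As a rough illustration of the % wildcard, the following Python sketch translates a query into a regular expression and applies it to the three citation texts listed above. It assumes that “%” means zero or more characters and “_” exactly one; the function name is hypothetical, the last test string is made up, and the program's real matching logic may differ.

```python
import re

def wildcard_to_regex(query):
    """Translate a String query: '%' -> zero or more characters, '_' -> exactly one."""
    out = []
    for ch in query:
        if ch == "%":
            out.append(".*")
        elif ch == "_":
            out.append(".")
        else:
            out.append(re.escape(ch))
    return re.compile("".join(out))

pattern = wildcard_to_regex("labio%rojo")

# Citation texts from the results listed above.
citations = [
    "sus labios rojos, húmedos y graciosamente imperativos",
    "se limpió los labios y manchó la servilleta de rojo",
    "labios pintados de rojo escarlata",
]
matched = [c for c in citations if pattern.search(c)]

# Made-up counterexample: rojo comes before labios, so the order does not fit.
reversed_order = pattern.search("rojo como sus labios")
print(len(matched), reversed_order)
```

All three citations match because labio precedes rojo in each, however far apart the two strings are. This is also why a % query can return loosely related results when the strings are short.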
The user should realize that a simple search on a word or string may produce a significant number of returns. For example, a Word search on casa will provide more than 3,500 results. A String search ending with “ado” will produce 31,926 results. The user may wish to consider the search parameters carefully, especially with common or high-frequency words, and limit the search before beginning. If a query does produce a large number of results, the program will pause after 500 results and prompt the user to either continue or abort the current process. In such instances, the user may wish to abort the search and limit it by number (see Limit Search Help Guide), Location, Source, or date (Century or Year) to provide a more manageable selection.
Search results are displayed in the following format, in chronological order:
- First, in brackets, the date and place of the source text cited.
- Second, the text citation as given in the Boyd-Bowman Corpus (see Citation Help Guide for a description of how citations are presented).
- Third, in brackets, the 3-character abbreviation for the text title, typically followed by the pages within the text where the citation is found (for a complete listing of text abbreviations, see the Source Abbreviation and Text Title link on the Help tab or the link next to the Source search box on the home page). If the text source has an associated volume number, that number precedes the page number.
For example:
[c. 1575 México] sería adúltera y moriría estruxada la cabeça entre dos piedras [BSG 4, 5]
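For users who export results and wish to split them into their parts, the display format can be taken apart with a short Python sketch. It is based only on the format described above and makes several assumptions: the date is a four-digit year optionally preceded by “c. ”, the abbreviation is exactly three non-space characters, and the volume/page information is kept as one unparsed string. The names Result and parse_result are hypothetical.

```python
import re
from typing import NamedTuple, Optional

class Result(NamedTuple):
    date: str     # e.g. "c. 1575"
    place: str    # e.g. "México"
    text: str     # the citation itself
    source: str   # 3-character abbreviation of the text title
    locator: str  # volume and/or page information, left unparsed

RESULT_RE = re.compile(
    r"^\[(?P<date>(?:c\. )?\d{4}) (?P<place>[^\]]+)\] "
    r"(?P<text>.+) "
    r"\[(?P<source>\S{3}) (?P<locator>[^\]]+)\]$"
)

def parse_result(line: str) -> Optional[Result]:
    """Split one displayed result into its parts; return None if the format differs."""
    m = RESULT_RE.match(line.strip())
    return Result(**m.groupdict()) if m else None

example = ("[c. 1575 México] sería adúltera y moriría estruxada "
           "la cabeça entre dos piedras [BSG 4, 5]")
parsed = parse_result(example)
print(parsed)
```

Lines with other date formats would return None under these assumptions and would need a broader pattern.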
This is the original information from the Léxico hispanoamericano published on CD-ROM.[1] Copyright on all data is held by the Hispanic Seminary of Medieval Studies.
[1] Peter Boyd-Bowman’s Léxico hispanoamericano 1493-1993. Eds. Ray Harris-Northall and John J. Nitti. Technical development by Jean E. Lentz. New York: Hispanic Seminary of Medieval Studies, 2003-2007. Version 2.0. April 2007. For those interested in the construction and history of the Boyd-Bowman project, Léxico hispanoamericano, there is information under the History of the Project link under the About tab on this website.