Monday, March 20, 2006 6:01 AM
Michael S. Kaplan
Pretending the vowels aren't there
The other day, Suzanne was talking about Hebrew searches (it was good that she finally posted, or else people might have started talking about Suzanne searches?).
So let us look at the issue a bit. The problem is that the two strings:
ארץ־ישראל
and
אֶרֶץ־יִשְׂרָאֵל
are at some level the same string, and in some cases it is entirely reasonable to want a search to find them both.
So, let's look at the sort keys for them, with vowels:
12 02 12 1a 12 17 07 6b 12 0b 12 1b 12 1a 12 02 12 0e 01 2a 2a 02 02 28 57 2c 29 01 01 01 00
and without them:
12 02 12 1a 12 17 07 6b 12 0b 12 1b 12 1a 12 02 12 0e 01 01 01 01 00
and now (wait for it!) with vowels, but with the NORM_IGNORENONSPACE flag included to remove the secondary weights:
12 02 12 1a 12 17 07 6b 12 0b 12 1b 12 1a 12 02 12 0e 01 01 01 01 00
Ah, there is our answer -- if we have a search capability to optionally ignore the secondary weights, it will find both strings.
Note that these same principles apply to the trope marks that are used to indicate how to chant in Hebrew -- they get ignored via the same flag.
(You can ignore symbols to get rid of that hyphen-looking thing in there, the Maqaf!)
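For readers who want to play with the idea outside of Windows, here is a rough sketch in Python. It is not the sort key mechanism shown above: it only approximates "ignore nonspacing" by dropping characters whose Unicode general category is Mn (which covers the Hebrew vowel points and the trope marks), and approximates "ignore symbols" by dropping punctuation such as the Maqaf. The function names and the `ignore_punctuation` parameter are made up for illustration.

```python
import unicodedata

# The two strings from the post, written as escapes so the code points are explicit.
POINTED = (
    "\u05d0\u05b6\u05e8\u05b6\u05e5\u05be"
    "\u05d9\u05b4\u05e9\u05b0\u05c2\u05e8\u05b8\u05d0\u05b5\u05dc"
)
UNPOINTED = "\u05d0\u05e8\u05e5\u05be\u05d9\u05e9\u05e8\u05d0\u05dc"


def strip_marks(text: str, ignore_punctuation: bool = False) -> str:
    """Return text without nonspacing marks (hypothetical helper).

    Assumption: general category Mn stands in for "secondary weights",
    and categories starting with P stand in for "symbols".
    """
    kept = []
    for ch in unicodedata.normalize("NFD", text):
        category = unicodedata.category(ch)
        if category == "Mn":
            continue  # vowel points, trope marks, other combining marks
        if ignore_punctuation and category.startswith("P"):
            continue  # e.g. U+05BE HEBREW PUNCTUATION MAQAF
        kept.append(ch)
    return "".join(kept)


def contains_ignoring_marks(needle: str, haystack: str) -> bool:
    """Substring search that pretends the vowels are not there."""
    return strip_marks(needle) in strip_marks(haystack)
```

With these definitions, `contains_ignoring_marks(UNPOINTED, POINTED)` and `contains_ignoring_marks(POINTED, UNPOINTED)` both succeed, because the two strings reduce to the same sequence of letters. A real implementation would compare collation keys rather than stripped strings, since stripping by category ignores language-specific rules.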
Such an option does not exist in Google or Live.com, and although it appears to exist in Word, it does not work:

Ah well, something everyone can work on for the future! :-)
This post brought to you by "ל" (U+05dc, a.k.a. HEBREW LETTER LAMED)