Why that is positively Ethiopic!

Sorting it all Out
Michael Kaplan's random stuff of dubious value

A little over a week ago, when I posted "In Tamil -- sometimes, they are digits; other times, just numbers", Scott Hanselman suggested "That would ROCK if you would do Ethiopic sometime." Well, rock on Scott -- today is the day.

For the record, I am not an expert in these things, just a geek who finds alternate number systems to be really interesting (whether Roman numerals, Tamil numbers, or Ethiopic numbers).

Ready? Here we go....

Factoid -- there is no Ethiopic zero. There are characters for numbers that we would write with zeros in them (10, 20, 30, etc.), but no character for zero itself. It makes the number system quite fascinating.

We'll start with a small quote from the Unicode Standard on the subject, found in Chapter 12, Section 1 (available for viewing online in PDF format):

Numbers. Ethiopic digit glyphs are derived from the Greek alphabet, possibly borrowed from Coptic letterforms. In modern use, European digits are often used. The Ethiopic number system does not use a zero, nor is it based on digital-positional notation. A number is denoted as a sequence of powers of 100, each preceded by a coefficient (2 through 99). In each term of the series, the power 100^n is indicated by n HUNDRED characters (merged to a digraph when n = 2). The coefficient is indicated by a tens digit and a ones digit, either of which is absent if its value is zero.

For example, the number 2345 is represented by

2,345 = (20 + 3)*100^1 + (40 + 5)*100^0
      = 20 3 100 40 5
      = TWENTY THREE HUNDRED FORTY FIVE
      = 1373 136b 137b 1375 136d 
      = ፳፫፻፵፭
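
If you want to check that arithmetic yourself, here is a small Python sketch. It leans on the numeric values that the Unicode Character Database assigns to these characters, as exposed by Python's standard unicodedata module; the helper name numeric_values is just something I made up for illustration.

```python
import unicodedata

def numeric_values(text):
    """Return the Unicode numeric value of each character, as ints."""
    return [int(unicodedata.numeric(ch)) for ch in text]

# U+1373 U+136B U+137B U+1375 U+136D, the example from the quote
example = "\u1373\u136b\u137b\u1375\u136d"
twenty, three, hundred, forty, five = numeric_values(example)

# (20 + 3) * 100^1 + (40 + 5) * 100^0
total = (twenty + three) * hundred + (forty + five)
print(numeric_values(example), total)  # [20, 3, 100, 40, 5] 2345
```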

If you are like me then your eyes may have crossed when you read this, even though the example seemed clear enough. Maybe they should have put in a bigger example....

Personally, I find Daniel's Ethiopic Number Algorithm #4 to be much clearer from a conceptual standpoint. If you prefer something a bit more cerebral with code samples, then you can look at http://www.geez.org/Numerals/ for a slightly different algorithm (using the same number, I suspect a shared source, maybe? <grin>). The page even has links to demonstrations of the algorithm in Perl, C, Java, and C#.
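
For a feel of what going in that direction (familiar digits to Ethiopic) can look like, here is a minimal Python sketch. It is not a port of either linked algorithm; it is written from the rules in the Unicode quote above, and it assumes that a coefficient of one is left out before HUNDRED and before a leading TEN THOUSAND (my reading of the "2 through 99" coefficient range). The names to_ethiopic, ONES and TENS are mine.

```python
ONES = "\u1369\u136a\u136b\u136c\u136d\u136e\u136f\u1370\u1371"  # 1..9
TENS = "\u1372\u1373\u1374\u1375\u1376\u1377\u1378\u1379\u137a"  # 10..90
HUNDRED = "\u137b"
TEN_THOUSAND = "\u137c"

def to_ethiopic(n):
    """Sketch: convert a positive int to Ethiopic numerals."""
    if n < 1:
        raise ValueError("there is no Ethiopic zero (or negative) to write")
    digits = str(n)
    if len(digits) % 2:
        digits = "0" + digits            # pad so the digits split into pairs
    pairs = [digits[i:i + 2] for i in range(0, len(digits), 2)]
    out = []
    for i, pair in enumerate(pairs):
        pos = len(pairs) - 1 - i         # 0 is the rightmost pair
        tens, ones = int(pair[0]), int(pair[1])
        text = (TENS[tens - 1] if tens else "") + (ONES[ones - 1] if ones else "")
        # Odd positions are closed by HUNDRED, even ones (above 0) by TEN THOUSAND.
        sep = "" if pos == 0 else (HUNDRED if pos % 2 else TEN_THOUSAND)
        if pair == "00" and sep == HUNDRED:
            sep = ""                     # an empty hundreds group vanishes entirely
        if pair == "01" and (sep == HUNDRED or (sep and i == 0)):
            text = ""                    # assumed rule: a coefficient of one is implied
        out.append(text + sep)
    return "".join(out)

print(to_ethiopic(2345))
```

With those assumptions, to_ethiopic(2345) gives back the ፳፫፻፵፭ from the Unicode example.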

So let us take the resulting number that both sites talk about (፯፻፷፭፼፵፫፻፳፩) and try to convert it back from Ethiopic to our familiar European digits:

= ፯፻፷፭፼፵፫፻፳፩

= 136f 137b 1377 136d 137c 1375 136b 137b 1373 1369

= DIGIT SEVEN; NUMBER HUNDRED; NUMBER SIXTY; DIGIT FIVE; NUMBER TEN THOUSAND; NUMBER FORTY; DIGIT THREE; NUMBER HUNDRED; NUMBER TWENTY; DIGIT ONE

(I removed the word ETHIOPIC from each character name to allow more to fit per line)
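
A listing like that does not have to be typed by hand. Here is a small sketch using Python's standard unicodedata module (the helper name describe is my own invention) that prints the code point and the shortened character name for each character in the number:

```python
import unicodedata

def describe(text):
    """Return (hex code point, name without the ETHIOPIC prefix) per character."""
    rows = []
    for ch in text:
        name = unicodedata.name(ch)
        rows.append((format(ord(ch), "04x"), name.replace("ETHIOPIC ", "", 1)))
    return rows

number = "\u136f\u137b\u1377\u136d\u137c\u1375\u136b\u137b\u1373\u1369"
for code, name in describe(number):
    print(code, name)
```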

At this point, even knowing what the number is, the words on the site ("Conversion from Ethiopic numerals into western form is trivial") do not seem quite as true, do they? :-)

It actually is easy; it just looks hard. Keep in mind that ETHIOPIC NUMBER HUNDRED and ETHIOPIC NUMBER TEN THOUSAND act as "sentinels" that close each group (with up to two digits in each group, between them), and we have:

= DIGIT SEVEN; NUMBER HUNDRED;
      NUMBER SIXTY; DIGIT FIVE; NUMBER TEN THOUSAND;
      NUMBER FORTY; DIGIT THREE; NUMBER HUNDRED;
      NUMBER TWENTY; DIGIT ONE

Notice how the sentinels keep swapping between the TEN THOUSAND and the HUNDRED? Interesting...

Picking at the pieces:

= 07 
      65 
      43 
      21

or more conventionally

7654321
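
The "picking at the pieces" step can be sketched in a few lines of Python. This is only a rough illustration: split_groups is a made-up helper, and simply gluing the groups back together works only for a number like this one, where no group has been left out.

```python
import unicodedata

HUNDRED = "\u137b"
TEN_THOUSAND = "\u137c"

def split_groups(text):
    """Split into (group value, sentinel) tuples; the last sentinel is None."""
    groups, value = [], 0
    for ch in text:
        if ch in (HUNDRED, TEN_THOUSAND):
            groups.append((value, ch))   # a sentinel closes the current group
            value = 0
        else:
            value += int(unicodedata.numeric(ch))  # tens and ones just add up
    groups.append((value, None))         # whatever is left is the final group
    return groups

number = "\u136f\u137b\u1377\u136d\u137c\u1375\u136b\u137b\u1373\u1369"
groups = split_groups(number)
print([value for value, _ in groups])                      # [7, 65, 43, 21]
print(int("".join(f"{value:02d}" for value, _ in groups)))  # 7654321
```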

Not too hard, right? Let's try another one:

= ፳፩፼፳፰፻፷፯፼፶፫፻፱

= 1373 1369 137c 1373 1370 137b 1377 136f 137c 1376 136b 137b 1371

= NUMBER TWENTY; DIGIT ONE; NUMBER TEN THOUSAND; NUMBER TWENTY; DIGIT EIGHT; NUMBER HUNDRED; NUMBER SIXTY; DIGIT SEVEN; NUMBER TEN THOUSAND; NUMBER FIFTY; DIGIT THREE; NUMBER HUNDRED; DIGIT NINE

A little harder this time, but let's do the grouping where those grouping sentinels are and see what we have:

= NUMBER TWENTY; DIGIT ONE; NUMBER TEN THOUSAND;
    NUMBER TWENTY; DIGIT EIGHT; NUMBER HUNDRED;
    NUMBER SIXTY; DIGIT SEVEN; NUMBER TEN THOUSAND;
    NUMBER FIFTY; DIGIT THREE; NUMBER HUNDRED;
    DIGIT NINE

We seem to be missing a digit right before that nine -- what happened to two numbers in each group? Ah, that's easy -- look at the sentinel! A zero goes there. So we have:

= 21 
    28 
    67 
    53 
    09

And as Tommy Tutone knows, Jenny's New York phone number is indeed 212-867-5309.

Ok, one more that shows a bit more of that missing zero stuff:

= ፶፻፭፼፭

= 1376 137b 136d 137c 136d

= NUMBER FIFTY; NUMBER HUNDRED; DIGIT FIVE; NUMBER TEN THOUSAND; DIGIT FIVE

Ooh, a tough one. I'll insert some fake zeros in where they seem to belong based on those sentinels:

= NUMBER FIFTY; NUMBER HUNDRED; 
    DIGIT ZERO; DIGIT FIVE; NUMBER TEN THOUSAND;
    DIGIT ZERO; DIGIT ZERO; NUMBER HUNDRED;
    DIGIT ZERO; DIGIT FIVE

So we have:

= 50 
    05
    00
    05

Or more conventionally 50,050,005.
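
The fake-zero trick can be written down too. The Python sketch below rests on an assumption drawn from the walkthroughs here: counting two-digit groups from the right, HUNDRED closes the odd positions and TEN THOUSAND closes the even ones, so a position that never shows up must be a 00. The helper name padded_pairs is mine, and a sentinel with nothing in front of it is assumed to mean a coefficient of one only for HUNDRED or at the very start.

```python
import unicodedata

HUNDRED = "\u137b"
TEN_THOUSAND = "\u137c"

def padded_pairs(text):
    """Return every two-digit group, left to right, with omitted groups as 0."""
    groups, value = [], 0
    for ch in text:
        if ch in (HUNDRED, TEN_THOUSAND):
            groups.append((value, ch))
            value = 0
        else:
            value += int(unicodedata.numeric(ch))
    pairs = {0: value}              # the unmarked tail is position 0
    pos = 0
    for i in range(len(groups) - 1, -1, -1):   # walk the groups right to left
        value, sentinel = groups[i]
        if sentinel == HUNDRED:
            if pos % 2:
                raise ValueError("two HUNDRED groups in a row")
            pos += 1                # next odd position
        else:
            pos += 1 if pos % 2 else 2   # next even position, skipping a missing odd one
        if value == 0 and (sentinel == HUNDRED or i == 0):
            value = 1               # bare sentinel: implied coefficient of one
        pairs[pos] = value
    return [pairs.get(p, 0) for p in range(pos, -1, -1)]

fifty = "\u1376\u137b\u136d\u137c\u136d"
print(padded_pairs(fifty))                                  # [50, 5, 0, 5]
print(int("".join(f"{p:02d}" for p in padded_pairs(fifty))))  # 50050005
```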

Now of course I am not saying that you would write code that is quite this silly. But it is reasonably straightforward to write an algorithm that can handle these numbers. It does need a bit more background than I would try to give for an interview question (though someone who could understand it in such a short time and come up with a good answer might have impressed me).
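
For what it is worth, here is one shape such an algorithm could take, as a hedged Python sketch rather than anybody's official method. It reads left to right and keeps three running values; the name from_ethiopic is mine, the input is assumed to be well formed (there is no validation), and a bare sentinel is assumed to carry an implied coefficient of one.

```python
import unicodedata

HUNDRED = "\u137b"
TEN_THOUSAND = "\u137c"

def from_ethiopic(text):
    """Sketch: convert well-formed Ethiopic numerals to a Python int."""
    total = 0   # everything already closed off by a TEN THOUSAND
    block = 0   # hundreds part of the current four-digit block
    pair = 0    # tens + ones collected since the last sentinel
    for ch in text:
        if ch == HUNDRED:
            block += (pair or 1) * 100            # bare HUNDRED means 1 * 100
            pair = 0
        elif ch == TEN_THOUSAND:
            # Shift everything seen so far up by four decimal places.
            total = ((total + block + pair) or 1) * 10000
            block = pair = 0
        else:
            pair += int(unicodedata.numeric(ch))  # 20 + 1, 40 + 3, and so on
    return total + block + pair

for sample in ("\u136f\u137b\u1377\u136d\u137c\u1375\u136b\u137b\u1373\u1369",
               "\u1373\u1369\u137c\u1373\u1370\u137b\u1377\u136f\u137c\u1376\u136b\u137b\u1371",
               "\u1376\u137b\u136d\u137c\u136d"):
    print(from_ethiopic(sample))
```

Run against the three numbers walked through above, this prints 7654321, 2128675309 and 50050005. No fake zeros are needed, because each TEN THOUSAND multiplies everything to its left by 10000.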

Anyone want to take a stab at it? :-)

Side note #1 -- the Unicode Technical Committee voted in UTC#98 to change the general category of the ETHIOPIC DIGITS from Nd (Number, Digit) to No (Number, Other) due in large part to the fact that the Ethiopic numbers are not generally used as digits. This change was effective as of Unicode 4.0.1. As such, the update will not be seen in Windows until Longhorn or in the .NET Framework until the version after Whidbey.

Side note #2 -- Ethiopic is in the category of scripts I defined in "The jury will give this string no weight" (a fact that will not be changing until, coincidentally, around the same time -- Longhorn and the .NET Framework in the version after Whidbey).

 

This post brought to you by "፼" (U+137c, a.k.a. ETHIOPIC NUMBER TEN THOUSAND)

Blog - Comment List
  • How do you represent a number with four or more consecutive zeros? Say, 1000001 or 10000000001. Do you alternate HUNDRED and TEN THOUSAND characters, with nothing between them, for each group of two zeros?
  • Well, here are a few more to show the pattern:

    1000001 ፻፼፩
    10000000001 ፻፼፼፩
    100000000000001 ፻፼፼፼፩

    :-)
  • 1000001 => ፻፼፩
    10000000001 => ፻፼፼፩

    See:

    http://geez.org/Numerals/NumberSamples.html

    sample numeral conversions from the sources.
  • Or you can do what I did.... I took the C# source and compiled it. The code creates that NumberSamples.html file, and you can add to it whatever numbers you like. :-)
  • Darn, I figured thousands of page views would find one person who wanted to give it a shot. :-)
  • I never understood how they represent the answer to "5 - 5" in these number systems... I guess they hadn't invented subtraction when they came up with it. :p~

    Besides, it's really only mathematicians who care about "0" - you don't really see it in every day life, do you?
  • Well, I see zero all the time.... and so do Swedes (it's on every elevator).
  • Sure, but if we didn't have a zero to begin with, then they'd probably put "G" on there (or whatever the first letter of the Swedish word for "ground" is).

    I guess my point was that if we hadn't "invented" zero so that subtraction could be properly defined, then we'd probably never need one (e.g. instead of saying "$0 deposit" in an ad for a car, you'd say "no deposit" or whatever.)

    Mind you, I only really started thinking about this when I posted my first post, so maybe there *are* plenty of reasons for a zero outside of mathematics - I just can't think of one right now (that is, where you can't replace the zero by something equally meaningful...)

    As an example, my local IP address is 10.0.0.45. Those two zeros could just be left blank and you'd have "10. . .45" which is equally unambiguous.

    Anyway, the Ethiopians, Romans and Sri Lankans seemed to get along fine without them. Maybe my point is irrelevant, I dunno, but it's interesting nonetheless.
  • Definitely interesting -- the whole area fascinates me. :-)

    Though it was too bad no one decided to code up the Ethiopic to Arabic-Indic solution....
  • Okay, all I know about this numeric system I have from this article and the linked algorithm, but don't you leave out some "power characters"?
    Like in "7654321" shouldn't it be "DIGIT SEVEN; NUMBER HUNDRED; *NUMBER TEN THOUSAND;*" etc.?
  • Just another reason I love reading your blog - there's always something that gets me thinking about things I would have never considered before. I mean, why would it have otherwise occurred to me that you can get along without a character for zero anyway?

    OK, you've convinced me: I'm at work now (different timezones and all) but when I get home, I'll see if I can't write a little Ethiopian to Arabic-Indic converter :)
  • Well, let's see -- the number would be ፯፻፷፭፼፵፫፻፳፩.

    That's:

    DIGIT SEVEN; NUMBER HUNDRED;
    NUMBER SIXTY; DIGIT FIVE; NUMBER TEN THOUSAND;
    NUMBER FORTY; DIGIT THREE; NUMBER HUNDRED;
    NUMBER TWENTY; DIGIT ONE

    So you started ok, but you forgot the two numbers between the HUNDRED and the TEN THOUSAND....

    :-)
  • No, I mean if you look at the linked "algorithm 4" it should be
    DIGIT SEVEN; NUMBER HUNDRED; NUMBER TEN THOUSAND;
    NUMBER SIXTY; DIGIT FIVE; NUMBER TEN THOUSAND;
    NUMBER FORTY; DIGIT THREE; NUMBER HUNDRED;
    NUMBER TWENTY; DIGIT ONE

    The seven is 7*10^6 or 7*10^(2+4) after all.
  • OK, I took up your challenge :)

    You can see a screenshot of my app here: http://www.codeka.com/tmp/ethiopian.png

    And you can download the C# source + binary here: http://www.codeka.com/tmp/ethiopian.zip

    Simply enter an arabic-indic number in the bottom text box, click convert and it'll output an ethiopic number (the code for that is "borrowed" from that web site). If you then cut'n'paste that ethiopic number into the top text box and click that convert button, it'll convert it back to the familiar arabic-indic form. (That's the code I wrote).

    I didn't write any automated tests, but you can manually test by typing a number into the arabic-indic box and clicking "Test" - this'll convert to Ethiopic then back again - you'll have to eyeball the result to make sure it's right.

    It's probably not that lenient with respect to invalid ethiopic numbers, but it works OK for normalized numbers.

    The code for doing the conversion is not that hard, but I won't bother explaining it here, since you should just be able to look at and see (probably better to follow it through with the debugger, it's not commented very well, heh)