Funny, It Worked Last Time
... and other odd mutterings of a performance junkie
October, 2004
Encodings in Strings are Evil Things (Part 5)
Posted over 8 years ago by ryanmy (6 comments)
However, regardless of whether pre-composed characters are favored or not, there are some character sequences which do not have pre-composed equivalents and must be represented using combining characters. Of course, our problem here is that most programmers don't think about accents as being distinct elements to iterate through! When you hit the right arrow in Microsoft Word to skip over an À, you don't go first to an A and then to the A's accent -- you move past the whole "character." (Unico...
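To make the "right arrow" behavior concrete, here is a minimal sketch of stepping over a base character together with any accents that follow it. It assumes the string is already decoded into code points (`std::u32string`), and the names `is_combining_mark` and `next_character` are hypothetical helpers, not anything from the post. It also only recognizes the Combining Diacritical Marks block (U+0300 to U+036F); a real implementation would consult the Unicode character database.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: true only for the Combining Diacritical Marks block
// (U+0300..U+036F). Unicode defines many more combining marks than this.
bool is_combining_mark(char32_t cp) {
    return cp >= 0x0300 && cp <= 0x036F;
}

// Returns the index just past the "character" that starts at pos:
// one base code point plus every combining mark that follows it.
std::size_t next_character(const std::u32string& s, std::size_t pos) {
    assert(pos <= s.size());
    if (pos >= s.size()) {
        return s.size();
    }
    ++pos;  // step over the base code point
    while (pos < s.size() && is_combining_mark(s[pos])) {
        ++pos;  // swallow the accents attached to it
    }
    return pos;
}
```

With this, the decomposed sequence A + U+0300 and the pre-composed U+00C0 both count as a single step, even though one is two code points and the other is one.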
Encodings in Strings are Evil Things (Part 4)
Posted over 8 years ago by ryanmy (2 comments)
In our last episode, we established that we wouldn't be able to make a true std::string replacement and still handle variable-width encodings. So, we started with the beginning lines of an rmstring class. However, this doesn't mean we are going to dispense with std::string entirely! And, as it turns out, compatibility with it is both easier and harder than actually making a std::string, depending on what you're implementing and where......
Encodings in Strings are Evil Things (Part 3)
Posted over 8 years ago by ryanmy (1 comment)
Yesterday, we took the definition of string as an ordered sequence of Unicode code points, and explored various schemes for encoding and decoding code point indices on a binary computer. At the end, we had a new definition for string -- a stream of bits, and some type of information identifying the encoding scheme used to interpret the bits as a stream of Unicode codepoints. Today, since I'm a coder, we'll be starting a C++ implementation of a string library based on this definition....
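As a rough illustration of that definition (raw storage plus something that says how to read it), the sketch below pairs a byte buffer with an encoding tag. The names `Encoding`, `EncodedString`, and `code_point_count` are assumptions made up for this example; they are not the class the series goes on to build, and only two encodings are handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical tag naming the scheme used to interpret the bytes.
enum class Encoding { Utf8, Utf32LE };

// A "string" under this definition: raw bytes plus the encoding tag.
struct EncodedString {
    Encoding encoding;
    std::vector<std::uint8_t> bytes;
};

// Counts code points; how to do so depends entirely on the tag.
// Assumes the bytes are well-formed for the stated encoding.
std::size_t code_point_count(const EncodedString& s) {
    switch (s.encoding) {
    case Encoding::Utf8: {
        std::size_t n = 0;
        for (std::uint8_t b : s.bytes) {
            // Continuation bytes look like 10xxxxxx; every other byte
            // starts a new code point.
            if ((b & 0xC0) != 0x80) {
                ++n;
            }
        }
        return n;
    }
    case Encoding::Utf32LE:
        assert(s.bytes.size() % 4 == 0);
        return s.bytes.size() / 4;  // fixed width: four bytes each
    }
    return 0;
}
```

The same two code points (U+00C0, U+0041) occupy three bytes in one encoding and eight in the other, which is why the bytes alone are not enough to call something a string.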
Encodings in Strings are Evil Things (Part 2)
Posted over 8 years ago by ryanmy (5 comments)
At the end of the last post, we reduced the abstract concept of "string" down to an "ordered sequence of Unicode code points." (We did so by choosing to actively ignore glyph information, but we'll be coming back to it later.) Unicode code points are simply numbers; of course, numbers have to be reduced to binary to be stored in a computer. Someone who is reading a string needs to use the exact same encoding scheme. And not all encoding schemes are equal......
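For a taste of what "reducing a number to binary" looks like with one particular scheme, here is a small sketch of UTF-8 encoding for a single code point. UTF-8 is chosen here only as a familiar example; the function name `encode_utf8` is an assumption for illustration, and the input is assumed to be a valid Unicode scalar value.

```cpp
#include <cassert>
#include <string>

// Encodes one code point as 1 to 4 UTF-8 bytes.
// Precondition: cp is at most U+10FFFF and is not a surrogate.
std::string encode_utf8(char32_t cp) {
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    std::string out;
    if (cp < 0x80) {
        // 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        // 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        // 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

A reader who assumed a different scheme, say one byte per character, would see U+00C0 as two unrelated characters instead of one À, which is the mismatch the excerpt warns about.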
Encodings In Strings Are Evil Things (Part 1)
Posted over 8 years ago by ryanmy (4 comments)
What is a string? About six months ago at the Game Developers Conference in San Jose, I sat in on a talk about performance tuning in Xbox games. The presenter had a slide that read: "Programmers love strings. Love hurts." This was shown while he described a game which was using a string identifier for every object in the game world and hashing on them, and was incurring a huge performance hit from thousands of strcmp()s each frame. I nodded -- but my mind was thinking......