Sorting it all Out Michael Kaplan's random stuff of dubious value Be sure to read the disclaimer here first!
In the newsgroups, Yi Zhang asked:
Hi all I've encountered the following situation in my work: a customer of my company reported that he can't read a file created in German user locale correctly from a English user locale. This is because German locale uses ',' as the decimal symbol while English uses '.' I tried to experiment with .csv files in Microsoft excel, and it seems that excel will always use current user locale to parse the .csv files, which could end up with incorrect result if the .csv file is created in a different user locale. So this seems to suggest that user should be responsible for setting the correct user locale before he tries to open a file. But this might be too much for the user. Is there a better way to do this? Or user should always be responsible for setting the correct user locale? Thanks in advance Yi Zhang
Hi all
I've encountered the following situation in my work: a customer of my company reported that he can't read a file created in German user locale correctly from a English user locale. This is because German locale uses ',' as the decimal symbol while English uses '.' I tried to experiment with .csv files in Microsoft excel, and it seems that excel will always use current user locale to parse the .csv files, which could end up with incorrect result if the .csv file is created in a different user locale. So this seems to suggest that user should be responsible for setting the correct user locale before he tries to open a file. But this might be too much for the user. Is there a better way to do this? Or user should always be responsible for setting the correct user locale?
Thanks in advance
Yi Zhang
I had vaguely remembered people mentioning something like this before, but I had always been using the Access wizard since it does a better job here.
Anyway, I decided to look in Excel's help to see if it could provide any assistance. Here is what I found in the Excel 2003 help:
You can change the separator character used in both delimited text files and comma separated values (CSV) text files. Change the delimiter in a delimited text file For a delimited text file, you can change the delimiter from a Tab character to another character in the second page of the Text Import Wizard. From the same wizard page, you can also change the way consecutive delimiters, such as consecutive quotes, are handled. Change the separator in a CSV text file Click the Windows Start menu. Click Control Panel. Open the Regional and Language Options dialog box. Click the Regional Options Tab. Click Customize. Type a new separator in the List separator box. Click OK twice. Note After you change the list separator character for your machine, all applications will use the new character. You can change the character back to the original character by using the same procedure.
You can change the separator character used in both delimited text files and comma separated values (CSV) text files.
Change the delimiter in a delimited text file
For a delimited text file, you can change the delimiter from a Tab character to another character in the second page of the Text Import Wizard. From the same wizard page, you can also change the way consecutive delimiters, such as consecutive quotes, are handled.
Change the separator in a CSV text file
Click the Windows Start menu. Click Control Panel. Open the Regional and Language Options dialog box. Click the Regional Options Tab. Click Customize. Type a new separator in the List separator box. Click OK twice. Note After you change the list separator character for your machine, all applications will use the new character. You can change the character back to the original character by using the same procedure.
Note After you change the list separator character for your machine, all applications will use the new character. You can change the character back to the original character by using the same procedure.
Hmmmm. Well, I guess this is not quite as bad as instructions for drivers that actually show a picture of the "This driver is not signed..." dialog as part of the instructions to install their driver, but it is about the same order of magnitude. Also it is a little obnoxious to instruct people to change the settings that will affect all applications so that an individual file that may have been sent by someone else with different settings can be imported....
I guess it is why I prefer the Access import text wizard, which lets you specify a different separator (after trying to initially guess at the one to use by looking at the actual file!)
Irregardless, Mihai responded to the post where I put the info from help:
Sorry MichKa, but this is really bad. It is not your fault, I know, it is Excel (or the Excel team)! The help teaches one to mess-up the settings for all application because Excel does not understand some basic concepts: - The 10 years old bullet in Nadine Kano's book "All language editions can read one another's documents" - CSV (Comma Separated Values), means COMMA separated, not "separated by the locale-dependent list separator" ================= Yi Zhang: my advice would be to try some formats other than CSV. I have done some preliminary tests, and Excel 2003 seem to work ok with XML, Tab delimited text, and Unicode Text (which is also tab delimited). I would vote with XML or Unicode Text (using Unicode, so you will have no problems with code pages and lost characters).
Sorry MichKa, but this is really bad. It is not your fault, I know, it is Excel (or the Excel team)!
The help teaches one to mess-up the settings for all application because Excel does not understand some basic concepts:
- The 10 years old bullet in Nadine Kano's book "All language editions can read one another's documents"
- CSV (Comma Separated Values), means COMMA separated, not "separated by the locale-dependent list separator"
=================
Yi Zhang: my advice would be to try some formats other than CSV. I have done some preliminary tests, and Excel 2003 seem to work ok with XML, Tab delimited text, and Unicode Text (which is also tab delimited). I would vote with XML or Unicode Text (using Unicode, so you will have no problems with code pages and lost characters).
I admit that Mihai has a point here -- one only has but to read the name. But short of that (and to retain backward compatibility), I could live with a configurable setting, so that one can change the behavior for the one import without affecting everything else the user runs. And ideally something that will look at the data and make an intelligent guess would be a good thing....
The thread ended with some input from Louis Solomon:
try open office ...
Now Louis included no information on what Open Office does to make this situation different, for better or worse. So I honestly cannot say what the purpose of his three word response was. I decided to search for his name connected to Open Office in newgroups to see if he had perhaps said more about it somewhere else. But the only one I found was:
Louis Solomon [SteelBytes] wrote: > The only thing that I can see so far that stops me switching to OpenOffice > (from MS Office), is lack of English UK (Australian in particular) support.
Not entirely relevant, but it makes for an interesting ounterpoint if he is not using the product (he may be now, but if so he has not been posting on it much under that name)
Anyway, does anyone know if the recommendation can be salvaged? Does Open Office have a particular feature in this area that is easier and less involved than just changing the file extension to .TXT from .CSV?
Wherever John Bonham is, I doubt he'd be very happy about this. Regular readers may remember my post
Thanks! This was of great help to me.
In Portuguese locale, dot is for grouping and comma for decimals. If I want to export some performance counter log into CSV or tab delimited I won't be able to open it in Excel because it won't get the format correctly. The performance counters CSV files are generated the same way, no matter the locale is...