27 February 2008

Unable to correctly display Chinese (Unicode) characters in Excel when opened through ASP.Net page

Recently I was working on an issue wherein one of our customers was trying to stream data from their web application in CSV format for it to be recognizable and opened through Excel on the client's end. Basically they were setting content-type and content-disposition to open the file outside the browser and open it in MS-Excel. Everything would have worked had they not used Chinese characters as data in this case.

Something like this:

Page.Response.Clear()
Page.Response.ContentType = "application/vnd.ms-excel"
Page.Response.ContentEncoding = System.Text.Encoding.UTF8
Page.Response.AddHeader("Content-Disposition", "attachment; filename=ExportData.xls")

And later in the code they were reading the column headers and column row in CSV format into a string which will get flushed as a response output.

Something like this:

'Output Column Headers as 

        columnHeaders = "HEADER1" + Chr(9) + "HEADER2"             
        columnHeaders = columnHeaders & Chr(13) & Chr(10)

[Here, Chr(9), Chr(10) and Chr(13) correspond to Tab, Linefeed and Carriage Return characters in ASCII respectively to adhere to CSV format]

Page.Response.Write(columnHeaders)
Page.Response.Write(Chr(10))

and

'Output Column Row as
columnRow = ""

After populating the columns in various strings we do this to adhere to CSV format:

columnRow = coulmn1 + Chr(9) + column2

columnRow = columnRow & Chr(13) & Chr(10)

...............

Page.Response.Write(columnRow)        ' Finally display the data

Now if you see above this should work if we try to open the file using Excel. Although if we are sending the data in UTF-8 encoding (let's say for Chinese characters), Excel doesn't recognize it correctly and opens it in ASCII. In normal scenarios the above functionality will not cause issues but if we are using any Unicode characters like Chinese the data will be wrongly displayed in Excel. You may see "???????" etc. Although it may display perfectly fine in the webpage control , let's say in a datagrid.

The resolution to such an issue is to switch from UTF-8 to Unicode and add Unicode byte leader
to the start of the file. Excel will recognize the byte-leader as an indication of Unicode data coming in, and correctly read the file as Unicode. This way Unicode characters like Chinese can be preserved when opened through Excel.

Here is something you can try:

Dim rgByteLeader(1) As Byte
        rgByteLeader(0) = &HFF
        rgByteLeader(1) = &HFE

        Page.Response.Clear()
        Page.Response.ContentType = "application/vnd.ms-excel"
        Page.Response.ContentEncoding = System.Text.Encoding.Unicode
        Page.Response.AddHeader("Content-Disposition", "attachment; filename=ExportData.xls")

' Write out the Unicode header FFFE so that Excel recognizes the file as Unicode()
        Page.Response.BinaryWrite(rgByteLeader)

'Output Column Headers as before

        columnHeaders = "HEADER1" + Chr(9) + "HEADER2"             
        columnHeaders = columnHeaders & Chr(13) & Chr(10)

        Page.Response.Write(columnHeaders)
        Page.Response.Write(Chr(10))

 

'Output Column Rows as before
        columnRow = ""

        .............

       columnRow = coulmn1 + Chr(9) + column2
       columnRow = columnRow & Chr(13) & Chr(10)

       Page.Response.Write(columnRow)

       .....

       Page.Response.End()

 

I am no Globalization/MS-Excel expert but I had a tough time researching on this issue so thought of sharing it with others. Hope this helps!

 

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

# BioSensorAB » Unable to correctly display Chinese (Unicode) characters in Excel when opened through ASP.Net page said:

PingBack from http://www.biosensorab.org/2008/02/26/unable-to-correctly-display-chinese-unicode-characters-in-excel-when-opened-through-aspnet-page/

26 February 08 at 8:59 PM
# Trevor said:

Thankyou! I'd hit a brick wall trying to get UTF-8 encoded comma-separated CSV files to automatically import correctly (with or without the BOM).

Using UTF-16LE encoding with the BOM (FF FE) and a tab as the separator everything is dandy.  Albeit that this is now more of a TSV..

05 March 08 at 6:47 AM
# Jannize said:

Thank you very much. I had the same problem and your solution help me to solve it.

26 June 08 at 3:33 AM
# Vicki said:

Thanks!

and there is another way...

//get the UTF-8 BOM

byte[] bom = System.Text.Encoding.UTF8.GetPreamble();

Response.BinaryWrite(bom);

29 October 08 at 11:13 AM
# udo mechels said:

Thanks,

Been searching for a solution to this for quite a while!

26 November 08 at 8:15 AM
# Me said:

Vicki,

You are a genius!

This is what I was looking for!!!

byte[] bom = System.Text.Encoding.UTF8.GetPreamble();

Response.BinaryWrite(bom);

13 December 08 at 5:00 PM
# KamleshBelote said:

Hi ,

I implement this only by

Response.ContentType = "application/vnd.ms-excel";

           Response.AddHeader("content-disposition", "attachment;filename=" + ((this._exportFileName.Trim().Length > 0) ? this._exportFileName : "Noname.xls"));

           Response.Write("<meta http-equiv=Content-Type content=\"text/html; charset=utf-8\">");

Resonse.write();

02 January 09 at 5:36 AM
# ShriHari said:

This works perfectly. Thanks a lot

byte[] bom = System.Text.Encoding.UTF8.GetPreamble();

Response.BinaryWrite(bom);

24 September 09 at 11:29 AM

Leave a Comment

Comment Policy: No HTML allowed. URIs and line breaks are converted automatically. Your e–mail address will not show up on any public page.

(required) 
(optional)
(required) 

  
Enter Code Here: Required

About Saurabh Singh

I am a Support Escalation Engineer with Microsoft GTSC, India. I have been supporting IIS and ASP.Net. I am a Computer Engineer (B.E.), did my graduation from one of the premier Engineering institutes in India. I have been working in the IT field for over 6 years now.
Page view tracker