
Microsoft Excel

The team blog for Microsoft Excel and Excel Services.
SpreadsheetML News

BioIT Alliance

I haven’t spent a lot of time on the new Excel file formats, since Brian Jones is writing a blog on that subject, but I saw a story today on Brian's blog that was interesting, so I thought I would share it with you.

Specifically, he posted a story about an industry group getting ready to take advantage of the Open XML formats for their business solutions. The newly announced BioIT Alliance was formed to help connect the pharmaceutical, biotechnology, hardware, and software industries. As you can imagine, Open XML formats can play a role here. Here is an excerpt:

"Through the BioIT Alliance, we are working closely with Microsoft to increase data access across our instrument systems and data analysis software tools using Ecma Open Office XML," said Catherine M. Burzik, president of Applied Biosystems. "This format enables life science companies to access data using the familiar Microsoft Office Excel(R) interface, providing them with the insight they need to make decisions more quickly."

This is an example of how people can use the SpreadsheetML format to automatically generate data in a much richer, more interactive format. Specifically, it is a nice demonstration of how, rather than just spitting out CSV files, users and organizations can now output rich spreadsheet files with charts, tables, logos, etc.
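
As a rough illustration of generating a spreadsheet file without Excel installed, here is a minimal Python sketch that writes a one-sheet package using only the standard library. The part names, namespaces, and content types are assumptions taken from the Office Open XML standard as it was later published, not from this post, and the helper names (`build_xlsx`, `sheet_xml`, `column_letter`) are hypothetical. Every cell is written as an inline string, so a value such as SEP2 is explicitly marked as text rather than left for the application to guess.

```python
import io
import zipfile
from xml.sax.saxutils import escape

# Namespaces and content types as assumed from the published standard.
MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
DOC_REL = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PKG_REL = "http://schemas.openxmlformats.org/package/2006/relationships"
CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"
CT_BASE = "application/vnd.openxmlformats-officedocument.spreadsheetml."
XML_DECL = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'


def column_letter(index):
    """Convert a zero-based column index to a letter: 0 -> A, 26 -> AA."""
    letters = ""
    index += 1
    while index:
        index, rem = divmod(index - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def sheet_xml(rows):
    """Build the worksheet part; every value is stored as inline text."""
    out = [XML_DECL, f'<worksheet xmlns="{MAIN_NS}"><sheetData>']
    for r, row in enumerate(rows, start=1):
        out.append(f'<row r="{r}">')
        for c, value in enumerate(row):
            ref = f"{column_letter(c)}{r}"
            text = escape(str(value))
            out.append(f'<c r="{ref}" t="inlineStr"><is><t>{text}</t></is></c>')
        out.append("</row>")
    out.append("</sheetData></worksheet>")
    return "".join(out)


def build_xlsx(rows):
    """Return the bytes of a minimal single-sheet package (a zip of XML parts)."""
    parts = {
        "[Content_Types].xml": XML_DECL
        + f'<Types xmlns="{CT_NS}">'
        + '<Default Extension="rels" ContentType='
        + '"application/vnd.openxmlformats-package.relationships+xml"/>'
        + '<Default Extension="xml" ContentType="application/xml"/>'
        + f'<Override PartName="/xl/workbook.xml" ContentType="{CT_BASE}sheet.main+xml"/>'
        + f'<Override PartName="/xl/worksheets/sheet1.xml" ContentType="{CT_BASE}worksheet+xml"/>'
        + "</Types>",
        "_rels/.rels": XML_DECL
        + f'<Relationships xmlns="{PKG_REL}">'
        + f'<Relationship Id="rId1" Type="{DOC_REL}/officeDocument" Target="xl/workbook.xml"/>'
        + "</Relationships>",
        "xl/workbook.xml": XML_DECL
        + f'<workbook xmlns="{MAIN_NS}" xmlns:r="{DOC_REL}">'
        + '<sheets><sheet name="Sheet1" sheetId="1" r:id="rId1"/></sheets>'
        + "</workbook>",
        "xl/_rels/workbook.xml.rels": XML_DECL
        + f'<Relationships xmlns="{PKG_REL}">'
        + f'<Relationship Id="rId1" Type="{DOC_REL}/worksheet" Target="worksheets/sheet1.xml"/>'
        + "</Relationships>",
        "xl/worksheets/sheet1.xml": sheet_xml(rows),
    }
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, xml in parts.items():
            zf.writestr(name, xml)
    return buf.getvalue()
```

Writing the result of `build_xlsx([["Gene", "Count"], ["SEP2", "14"]])` to a file ending in .xlsx should give a workbook whose cells are all text. The sketch deliberately omits styles, numeric cells, charts, and shared strings, and it has not been checked against every Excel release.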

OpenXMLDeveloper.org

While I am at it, I thought I would point out http://openxmldeveloper.org/default.aspx.  From their web site:  Announced March 21, 2006, the Open XML Formats Developer Group was initially founded by 40 organizations from around the world to provide a technical forum for developers who are interested in using the Ecma International-developed Office Open XML file formats.  Membership in the community is open to anyone free of charge to enable broad development with the formats, regardless of platform. Below are further details on the community’s goals, developers who are involved in working with the Office Open XML formats, and quotes about the value of the community and the formats.

The Open XML Formats Developer Group is being formed as a community for developers to exchange information with each other regarding the usage of the Ecma-developed Office Open XML file formats. The community will serve as a technical resource for Open XML developers to submit and answer technical questions and to share tools and ideas around Open XML Formats-based solutions. The Open XML Formats Developer Group is open to anyone free of charge to enable broad participation and development of solutions using the Open XML Formats on any platform. The Open XML Formats Developer Group will support the wide adoption of the specifications being created by Ecma Technical Committee 45.

Founding Community Members:

• 4 screen AG: http://www.4screen.com
• Acorn Systems: http://www.acornsys.com
• ACOS AS: http://www.acos.no
• APL2000 Inc.: http://www.apl2000.com
• Apple: http://www.apple.com
• Ascentn Corp.: http://www.ascentn.com
• atQuest Solution Pte Ltd.: http://atquest.com
• BP: http://www.bp.com
• Blaze SSI Corp.: http://www.blazessi.com
• Business Engine: http://www.businessengine.com
• CAPITA Education Services: http://www.capitaes.co.uk
• Certeon Inc.: http://www.certeon.com
• Colligo Networks: http://www.colligo.com
• ComponentOne LLC: http://www.componentone.com
• CyberSavvy.NET LLC.: http://www.cybersavvy.net/
• Document Sciences Corp.: http://www.docscience.com
• Essilor International: http://www.essilor.com
• Florida House of Representatives: http://www.myfloridahouse.gov
• Flowfinity Wireless Inc.: http://flowfinity.com
• Fractal Edge Ltd.: http://www.fractaledge.com
• ILOG: http://www.ilog.com/
• i-Magination Group: http://www.i-maginationgroup.com
• Intel Corporation: http://www.intel.com
• InterKnowlogy LLC: http://www.interknowlogy.com/IKCorporate
• IntelliSafe Technologies: http://www.intellisafe.com
• IP Commerce Inc.: http://ipcommerce.com
• ITVT GmbH: http://www.itvt.de
• Mathsoft: http://www.mathsoft.com
• Microsoft Corp.: http://microsoft.com
• NextPage Inc.: http://www.nextpage.com
• NuSoft Solutions Inc.: http://nusoftsolutions.com
• OBS Services and Solutions: http://www.obs.com.au
• Panorama Software: http://www.panorama.com
• RM Sistemas SA: http://www.rm.com.br
• Sonata Software Ltd.: http://www.sonata-software.com
• SourceCode Technology Holdings Inc.: http://www.k2workflow.com
• Spescom Software Inc.: http://www.spescomsoftware.com
• The Computer Solution Company: http://www.tcsc.com
• Toshiba: http://www.toshiba.com
• XINNOVATION: http://www.xinn.com

 

Posted: Thursday, April 06, 2006 9:22 AM by David Gainer

Comments

Harlan Grove said:

ECMA TC45 has reported a final spec? When?
# April 6, 2006 7:13 PM

Harlan Grove said:

Forgot to ask. How does the BioIT Alliance like Excel's current pigheadedness about treating any and all entries that look like dates as dates? For example, mangling gene sequences containing APR1 or SEP7.

Will XL12 *FINALLY* provide some means of turning any and all automatic formatting off?
# April 6, 2006 7:28 PM

Vic Eldridge said:

I whole-heartedly agree with Harlan on this one.  We send out thousands of CSV files to hundreds of clients who often use Excel to open the files, which goes ahead and mangles anything that looks like a date.  The clients then ring up complaining that we've screwed up their data.  It wasn't us that screwed up, it was Microsoft, but try telling that to hundreds of clients.  
I ended up having to write a whole new layer of software that converts our CSV files to XLS files, which also has to keep track of which clients are using Excel, and which ones aren't.

If you're not sure what Harlan was talking about regarding gene sequences, take a look here,
http://www.theregister.co.uk/2004/07/16/excel_vanishing_dna/

It's quite conceivable that people's lives could depend upon the availability and integrity of such databases.  
Would you like to save someone's life David ?
Here's your chance.

# April 6, 2006 11:06 PM
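
For anyone shipping CSV files in the situation described above, a pre-flight check can at least flag the values most likely to be reinterpreted. The following Python sketch scans CSV text for cells that look like a month name followed by a day number, such as APR1 or SEP7. The pattern is only an approximation chosen for illustration and does not reproduce Excel's actual date-recognition rules; `find_date_like_cells` is a hypothetical helper name.

```python
import csv
import io
import re

# Month abbreviations (plus two longer forms) followed by a 1-2 digit number.
# This is a heuristic, not Excel's real parsing logic.
_MONTHS = "JAN|FEB|MAR|MARCH|APR|MAY|JUN|JUL|AUG|SEP|SEPT|OCT|NOV|DEC"
_DATE_LIKE = re.compile(rf"^(?:{_MONTHS})[-/ ]?\d{{1,2}}$", re.IGNORECASE)


def find_date_like_cells(csv_text):
    """Return (row, column, value) for each cell that resembles a date.

    Row and column numbers are 1-based to match spreadsheet conventions.
    """
    hits = []
    reader = csv.reader(io.StringIO(csv_text))
    for row_no, row in enumerate(reader, start=1):
        for col_no, value in enumerate(row, start=1):
            if _DATE_LIKE.match(value.strip()):
                hits.append((row_no, col_no, value))
    return hits
```

Calling `find_date_like_cells("gene,count\nAPR1,3\nBRCA1,5\nSEP7,2\n")` reports the APR1 and SEP7 cells and leaves BRCA1 alone. What to do with flagged cells (warn the client, or ship a typed spreadsheet file instead) is a separate decision.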

Mike said:


Not sure I understand everything.

David said "this is a nice demonstration of how rather than just spitting out CSV files, users and organizations can now output rich spreadsheet files with charts, tables, logos, etc.". Well, I think the rich COM SDK has been available for, say, ten years, and you can do anything with it. It does not scale without a secondary monitoring process, but that can work even on a server.

As for gene sequences, I don't get it either. SEP2 is indeed automatically replaced by Excel, but 'SEP2 isn't. Or, using automation code, writing content to a cell as text simply disables Excel's auto-replace function.

# April 7, 2006 1:29 AM

Mike said:


In retrospect, while I understand that Excel XP/2003 SS/SpreadsheetML does not support charts, pictures, and drawing objects, why would anyone want to use XML instead of XLS anyway, when the XML is only understood by particular editions and versions of Excel?

To get around deployment problems, one would stick with XLS files.

What changes with Excel 2007 is that developers will be able to generate XML without losing the rich objects, but there are two big shortcomings to this:
- deployment: unless users install the converters, there is no way they are going to be able to do anything meaningful with .xlsx files.
- no support for some of the most striking features of Excel 2007: running, say, Excel 2000 with a .xlsx converter won't let you see databars, for example. So if someone uses databars instead of a separate chart and then shares the file across the enterprise or outside, all he is doing is creating file chaos. Not a very shiny prospect IMHO.

# April 7, 2006 1:35 AM

Harlan Grove said:

Yo Mike! Do you really believe Microsoft has spent so much development time & money on XL12 and O12 generally to be happy if as few people upgrade to it as upgraded to XL11/O2003? This is intended as a 'must have' update.

Since the company I work for still hasn't completed the Office 97 to Office XP update for all employees, I don't have to worry about disabling the ribbon for at least 5 years.

And the rationale for XML formats is allowing other applications to use the data in the files more easily. The .xls file format isn't one of the world's best examples of clarity and straightforwardness.
# April 7, 2006 2:14 AM

Mike said:


I think the rationale is to move everybody in the enterprise to the server license, as a way to achieve a consistent user experience (limited to viewing though) and avoid the costly rich client upgrade mess (even if only the converters are deployed).

As for XML, I don't worry much about that either. You soon realize that the XML is only a re-serialized version of the original binary file formats, with data and layout mixed together. And, as a consequence, if you want to make any change in XML programmatically outside their customXML thing (essentially a dataset represented as an XML fragment) then you'll find yourself out of luck : you need the actual Office app instances to recompute values, repaginate documents, etc.

Furthermore, I don't buy this ZIP-XML thing at all from a programming point of view. Why not use the official COM APIs? And of course, from a user point of view, all I see are new file extensions...

# April 7, 2006 3:25 AM

David Gainer said:

Hi folks

Harlan, I have not heard an official date yet.  From what I have heard, the group is aiming to be done by the end of the year, but it is really up to the group.  Can you give me an idea of what you mean by automatic formatting?  If you mean what I think you mean, there is an option in Tools|Options|Edit called “extend data range formats and formulas”.  Harlan, Vic, your comment about gene sequences has been duly noted – I definitely saw that article a few times when it came out.  That said, using the XML format solves the problem you bring up, which could be an example of why they are interested.

Mike, Harlan, with regard to the XML file formats: Harlan is correct that one of the benefits of the format is that files can be easily created or changed from other apps in a straightforward manner (meaning Office does not have to be on the machine, or you can be running a Linux box, for example).  Here is an example of how to build a file with a table: http://blogs.msdn.com/brian_jones/archive/2005/06/27/433152.aspx  That’s a lot simpler than it is currently.

Additionally, Harlan actually brings up a great example of why this is more efficient than current methods of moving around data (.csv, .txt files).  Currently, why do a lot of .txt files get opened in Excel?  Because a user wants to do something with some data that comes out of another system, and the other system cannot write .XLS files (because they are so complicated, because the underlying OS won't run XL, etc.).  With the XML/zip-based formats, it is very easy to write the code to generate the file, and the code can be written on any OS, because both XML and zip are well supported on almost any platform.  The best part is, because it is an Excel file and not a text file, the person dumping the data can include information like whether SEP2 is a date or a gene sequence.  (You can fairly take issue with the way Excel guesses at dates, but the real culprit is a lack of information from the data transport technology.)

We will be releasing converters for all supported versions of Office, so Office 2000 users will be able to open these files.  True, they won't have data bars, but they will see gene sequences and not dates.

Have a great weekend.  
# April 7, 2006 1:53 PM

Harlan Grove said:

' . . . Tools|Options|Edit called “extend data range formats and formulas”.'

Nice idea. Doesn't work. I'm assuming this is the option named "Extend list formats and formulas" in XL10. If the Normal style's number format is General, Excel blissfully converts all entries that look similar to dates into date values. This is so whether that particular option is enabled or disabled.

For comparison, Lotus 123 and Gnumeric both accept the entries APR1 and SEP5 as text. OpenOffice Calc converts these entries to date values and provides no setting to disable this 'functionality', just like Excel. However, unlike Excel (and MUCH, MUCH MORE USEFULLY), OpenOffice Calc *ALWAYS* displays its text import wizard when opening *ANY* text file, *INCLUDING* text files with .CSV extensions. Would that there were an option to tell Excel to do the same thing.
# April 7, 2006 5:35 PM

Harlan Grove said:

And in re CSV files vs XML, as long as there are mainframes running 30+ year-old COBOL programs, there will continue to be CSV files and text files with fixed field and record lengths. XML may be better than .XLS, .DOC and .PPT, but it's not necessarily better than plain text for tabular data.

The intent behind XML is that it kill off binary file formats. There's no hue & cry to do away with plain text. The only problems with plain text are those caused by software that's too clever by half, such as Excel. Blaming plain text for Excel mangling gene sequences makes as much sense as blaming the dinosaurs for causing global warming because they decomposed into fossil fuels.
# April 7, 2006 5:49 PM

Harlan Grove said:

Off-topic specifically, but still Excel related.

Will XL12 fix reference ambiguity? By that I mean the following scenario. Take a workbook named x.xls with a single worksheet named x. While it's open, copy its A1 cell, switch to another workbook (y.xls), and paste as a link in cell Sheet1!A1. The formula appears as =x.xls!$A$1. Now create another worksheet in y.xls named x.xls. Copy that worksheet's A1 cell, and paste a link in Sheet1!A2. That formula also appears as =x.xls!$A$1.
# April 7, 2006 6:56 PM

Cutedbm said:

Hi Dave, I'm from China and I just don't know where to report this problem.

My mouse is IE4.0. In Excel 2003, I can use the 4-Way Scrolling function but in Excel 2007 I can't  scroll right and left.

Could you contact the hardware department about this problem? Thanks a lot ...

# April 8, 2006 6:33 AM

David Gainer said:

Harlan, I think we are agreeing, although there is an interesting scenario now open where you can easily generate files that contain tabular data and other things (calc columns, charts, etc.).

Also, we have apparently fixed reference ambiguity - I just tried your scenario, and it references sheet x in y.

Cutedbm, I will pass that along.
# April 10, 2006 2:35 PM

Harlan Grove said:

Re reference ambiguity,

"Also, we have apparently fixed reference ambiguity - I just tried your scenario, and it references sheet x in y."

The worksheet in file y.xls is named x.xls. It's file x.xls that has only one worksheet named x. So in XL12 if one enters the following formula in any cell in y.xls

=[x.xls]x!A1

it now remains =[x.xls]x!A1? Good.
# April 11, 2006 12:13 PM

David Gainer said:

Well, it becomes x.xlsx!A1, and returns the correct value.
# April 11, 2006 2:29 PM

Harlan Grove said:

You'd need to change the name of the worksheet in the file y.xls (or is it y.xlsx?) to x.xlsx. I believe when you do so, you'll once again have an internal reference in one formula and an external reference in the other formula that both appear - AFTER ENTRY - as

=x.xlsx!A1

If so, the reference ambiguity lives on.
# April 11, 2006 6:54 PM

Microsoft Excel 2007 (nee Excel 12) said:

To this point in the blog, I haven't talked too much about the file formats that Excel 2007 uses, since...
# July 20, 2006 5:39 PM