Newsletter for e-Business consultants and practitioners January-February 2000
Editors: Eldar A. Musayev

Content

XML - eXtensible Markup Language: serving suggestions
What is it?
Separation of data and presentation layers
Server-side generated HTML
Loose coupling integration of heterogeneous systems (applications)
Flexible format for data transfer / communication protocols
Data storage format
XML in a browser
References
Internet
Books

XML - eXtensible Markup Language: serving suggestions

For dining you should have a knife on the right and a fork on the left, a special fork should be placed behind the plate, and when it comes to a spoon… Why? Because otherwise it would be "mauvais ton", which means poor, bad or inappropriate manners, tone or behavior. Here in hi-tech we are fortunate enough to be free from French etiquette, because breaking it has proved impressively efficient too many times. However, it is still worth following some basic rules of civilized behavior, such as not talking about something you have no idea about. It looks like at the XML table we have a bunch of poorly mannered guys who forgot this rule in favor of a sales pitch. So here we are: this is an overview of what can be done with XML and what is probably not the best idea (like eating soup with a fork and knife). Of course, there is always more than one opinion, so don't be afraid to put the knife on the left; just don't forget to use it as appropriate.

What is it?

XML is a mark-up language like HTML. Like HTML, it is rooted in SGML, a mark-up language for defining other mark-up languages. You can get the official XML specifications in the same place as the HTML specs: on the site of the W3C (World Wide Web Consortium). XML was originally created because HTML provided insufficient expressive power, while the complexity of SGML made it too demanding to use. XML lets you define new languages (in fact, XML dialects) suited to particular purposes. Moreover, XML per se is not the thing you use; it is these home-made dialects that give the language its special power. So, which purposes?

  1. Separation of data and presentation layer.
  2. Server-side generated HTML (which also helps to separate data and presentation layer plus something extra).
  3. Loose coupling integration of heterogeneous systems (applications).
  4. Flexible format for data transfer / communication protocols.
  5. In some cases, data storage format.

Separation of data and presentation layers

Even the simplest eBusiness systems require database integration. A static HTML page is usually of little value on its own, so many pages are generated on the server "on the fly" from the back-end data. For a long time the basic model was to generate HTML code for dynamic pages and fill this HTML code with data. This was done by CGI programs or their equivalents, like Microsoft ISAPI or Visual Basic ASP modules. These programs accessed databases and produced HTML output, merging data and presentation within their code. Now imagine that you have a system with about 100 such modules/programs, and your database just got an additional field that you want to show. How do you add it consistently? Or suppose you decide to change the layout of all tables to make them more readable. Ouch…

Now, how could it be done with XML? Your web programs generate XML output like:

<?xml version="1.0" ?>
<?xml-stylesheet type="text/css" href="hr.css" ?>
<humanresources>
  <man>
    <name>John Smith</name>
    <IQ>stupid</IQ>
  </man>
  <man>
    <name>Smith John</name>
    <IQ>moron</IQ>
  </man>
  <man>
    <name>Eldar Musayev (that's me)</name>
    <IQ>genius</IQ>
  </man>
</humanresources>

Now add a formal description of how to convert it into HTML. There are a number of options for that:

  1. CSS - Cascading Style Sheets
  2. XSL - eXtensible Stylesheet Language
  3. Custom built components

We will show an example with CSS here. For CSS and XSL, your application server can probably do the conversion on the server side. If not, you can get an off-the-shelf product and use it with your application server. The important thing is that you don't have to code unless you want to. So let's make a simple stylesheet like:

man {
  display: block;
  margin: 10pt;
}
name {
  display: inline;
  font-size: 10pt;
}
IQ {
  display: inline;
  color: red;
  font-size: 12pt;
  font-weight: bold;
}

Now, if you have Internet Explorer 5 or later, you can see what it looks like. The XML example should appear this way:

John Smith stupid
Smith John moron
Eldar Musayev (that's me) genius

XSL allows even richer processing of XML data: inclusion of HTML tags, sorting of data and so on. But whichever way is used, the key is the same: to change the style, you change it at a single point. Data are separate from the presentation. A change of data has a minimal effect on the presentation layer, and a change of look does not affect the data.
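The third option listed earlier, a custom-built component, can be sketched in a few lines. What follows is a minimal Python illustration under stated assumptions, not what any particular application server does: the function name render_staff and the table layout are invented for the example, and only the element names (humanresources, man, name, IQ) come from the HR sample. The point it shows is that all markup lives in one function, so a change of look is made once.

```python
import xml.etree.ElementTree as ET
from html import escape

# Sample data in the same dialect as the HR example.
HR_XML = """<humanresources>
  <man><name>John Smith</name><IQ>stupid</IQ></man>
  <man><name>Eldar Musayev (that's me)</name><IQ>genius</IQ></man>
</humanresources>"""


def render_staff(xml_text):
    """Convert the HR dialect into an HTML table (hypothetical layout)."""
    root = ET.fromstring(xml_text)
    rows = []
    for man in root.findall("man"):
        # findtext returns the default when the element is missing.
        name = escape(man.findtext("name", default=""))
        iq = escape(man.findtext("IQ", default=""))
        rows.append(f"<tr><td>{name}</td><td><b>{iq}</b></td></tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>"


print(render_staff(HR_XML))
```

Every program that produces the HR dialect can share this one converter; switching from a table to a list means editing render_staff and nothing else.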

Server-side generated HTML

The technique here is close to the one described in the previous section, but it gives more. Suppose you want a single look across your whole web site. How can it be achieved?

The obvious answer is to use templates. Suppose your pages look like:

<page>
  <header>Your title</header>
  <body>
    <top-advertising/>
    <navigation-bar/>
    <main-column> … individual content… </main-column>
    <side-column>
      <side-ad/>
      … individual content…
    </side-column>
  </body>
</page>

Then, through a standard application driven by an XSL description, or through your own application, you can replace elements like <header>, <main-column>, <top-advertising> with the appropriate HTML code.

The important thing is that you have centralized control over the standard elements. For example, the element <side-ad> may control the advertising code placed in the side column. Now, when you get a new advertiser or lose one, you just update your database and all pages start to show the ads correctly. If you have ever built a site with a number of ads, you can appreciate the value of such an improvement.
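A home-grown version of this replacement step might look like the Python sketch below. It assumes the standard elements are written as empty elements such as <side-ad/>; the dictionary STANDARD_ELEMENTS stands in for the database lookup mentioned above, and its HTML snippets, like the generic <div> wrapper, are invented for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# One central place that decides what each standard element becomes.
# In a real site these values would come from the database.
STANDARD_ELEMENTS = {
    "top-advertising": '<div class="banner">Top ad</div>',
    "navigation-bar": '<div class="nav">Home | Products | Contact</div>',
    "side-ad": '<div class="ad">Side ad</div>',
}

PAGE = """<page><header>Your title</header>
<body><top-advertising/><navigation-bar/>
<main-column>Main text</main-column>
<side-column><side-ad/>Side text</side-column>
</body></page>"""


def expand(elem):
    """Recursively turn a template element into an HTML string."""
    if elem.tag in STANDARD_ELEMENTS:
        return STANDARD_ELEMENTS[elem.tag]
    # Text before the first child, then each child followed by its tail text.
    inner = escape(elem.text or "", quote=False)
    for child in elem:
        inner += expand(child) + escape(child.tail or "", quote=False)
    # Any other element becomes a <div> named after its tag (an assumption).
    return f'<div class="{elem.tag}">{inner}</div>'


print(expand(ET.fromstring(PAGE)))
```

Changing an advertiser means changing one entry in the registry; no page template is touched.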

Loose coupling integration of heterogeneous systems (applications)

If you deal with EAI (Enterprise Application Integration), or even just need to integrate two applications, there are always a lot of ways to do that. But the important thing is that you don't want too tight a connection. Applications tend to change, and if your connectors are not flexible enough, you will have to change them each time the applications change.

For example, suppose you want to integrate an HR file with your web site, to show public data about your staff. The HR system, through your connector, produces an XML file like the one used above. Your web system automatically converts it to HTML and shows it. Now suppose that your connector from the HR system (because of an upgrade) started to emit Social Security Numbers, which are obviously not for publication. If the conversion is described correctly, your web site will not even notice that the new field appeared; it will just ignore it. That's a big difference from most other ways of integrating applications, which could require code changes and testing.
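The Social Security Number scenario can be sketched as follows. This is a minimal Python illustration, assuming the web side reads only the fields it knows about; the element name ssn, the placeholder number and the names PUBLIC_FIELDS and public_records are invented for the example.

```python
import xml.etree.ElementTree as ET

# Fields the web side knows how to publish; anything else is skipped.
PUBLIC_FIELDS = ("name", "IQ")

# Output of the upgraded connector: a new <ssn> element has appeared.
# (The element name and the number are made up for this example.)
UPGRADED_XML = """<humanresources>
  <man><name>John Smith</name><ssn>000-00-0000</ssn><IQ>stupid</IQ></man>
</humanresources>"""


def public_records(xml_text):
    """Return one dict per <man>, holding only the known public fields."""
    records = []
    for man in ET.fromstring(xml_text).findall("man"):
        record = {}
        for field in PUBLIC_FIELDS:
            value = man.findtext(field)
            if value is not None:
                record[field] = value
        records.append(record)
    return records


print(public_records(UPGRADED_XML))
```

The reader asks for elements by name instead of by position, so the new field passes by unnoticed and the existing code needs no change.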

Caveats include potential performance problems. If you pass several megabytes at a time over a fast LAN, conversion may add a noticeable toll to the time of a single call. Another problem may appear if you send XML to the client to be converted to HTML in the browser. In this case you still have to take care of sensitive data, even though users don't see it directly. However, such a change is still easier than in the case of a hard-wired integration.

Flexible format for data transfer / communication protocols

Close to the previous advantage is the ability to send data across networks with an acceptable level of tolerance to change on both ends. If you use XML, any extra fields are simply ignored and missing data may be defaulted, which leaves enough time to implement a change, or even lets you ignore the change altogether. Basically, this is also integration; it just happens over a network.
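Defaulting of missing data can be sketched the same way. The order message below is a hypothetical dialect made up for this example, as are the field names, the default values and the function read_order; the sketch only shows that one receiver can serve both an older and a newer sender.

```python
import xml.etree.ElementTree as ET

# Defaults applied when the sender omits a field (hypothetical values).
DEFAULTS = {"item": "", "quantity": "1", "currency": "USD"}


def read_order(xml_text):
    """Read the known fields of an <order> message, defaulting missing ones."""
    order = ET.fromstring(xml_text)
    return {field: order.findtext(field, default=fallback)
            for field, fallback in DEFAULTS.items()}


# An older sender that does not know about <currency> ...
old = read_order("<order><item>book</item><quantity>2</quantity></order>")
# ... and a newer one that already sends an extra <gift-wrap> field.
new = read_order("<order><item>book</item><currency>EUR</currency>"
                 "<gift-wrap>yes</gift-wrap></order>")
print(old)
print(new)
```

Neither end has to be upgraded on the same day: the missing <currency> falls back to the default, and the unknown <gift-wrap> is skipped.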

Data storage format

This is probably not the #1 use of XML, but it has its place. Of course, you would not want to keep a terabyte database in such a format; it is too expensive in terms of both disk space and performance. However, a personal library of CD records or books on your computer can easily be stored in XML format. Advantages include (but are not limited to):

  1. XML is text-based, so you can see what is in your database.
  2. It is highly tolerant of missing data and places no restrictions on content or field sizes.
  3. It is easy (i.e. cheap and fast) to implement, with many off-the-shelf XML readers available, including an ActiveX control for Visual Basic and standard packages for Java.
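As an illustration of how little code such storage needs, here is a minimal Python sketch of a personal CD library kept in an XML file. The dialect (library, cd, and the field names), the sample records and both function names are assumptions made for the example; field names must be valid XML element names.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def save_library(path, records):
    """Write a list of dicts as <library><cd>...</cd></library>."""
    root = ET.Element("library")
    for record in records:
        cd = ET.SubElement(root, "cd")
        for field, value in record.items():
            ET.SubElement(cd, field).text = value
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)


def load_library(path):
    """Read the file back; each <cd> becomes a dict of the fields it has."""
    return [{child.tag: child.text or "" for child in cd}
            for cd in ET.parse(path).getroot().findall("cd")]


records = [
    {"artist": "Example Artist", "title": "First Album", "year": "1999"},
    {"title": "Untitled Compilation"},  # missing fields are simply absent
]
path = os.path.join(tempfile.mkdtemp(), "library.xml")
save_library(path, records)
print(load_library(path) == records)
```

The resulting file can be opened in any text editor, and a record with missing fields needs no special handling.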

XML in a browser

It is likely that with time, when the XSL standard becomes more stable and more browsers support XML/XSL, this will become very interesting. In fact, even now XML can be used in business-to-business applications where developers can dictate which browser is used to access their system. It is another matter that any additional restriction adds work for support and the help desk, and an unstable standard may make it even worse. For example, can we be sure that a new version of IE or the JVM (Java Virtual Machine) will not affect this functionality? And how do we control that?

At the same time, we know a company where this approach was used successfully. The alternative there was their own fat client, which can create even more trouble than a more or less standard browser. So we cannot advise for or against this approach; we just list the cons and pros, and you have to decide for yourself. One thing is certain: XML in the browser is currently not an acceptable solution for public Internet sites.

Cons
  • Most browsers don't support this technology yet.
  • Internet Explorer 5 supports this technology, but in a specific way, which may change in the future.
  • It is still not clear whether XSL technology will become common. Despite all its advantages, it is still too heavy to use. That does not mean it will fail, but it will not necessarily win. For example, Java was supposed to be used on the client side, and currently we see minimal use of Java in this role.

Pros
  • You pass smaller files (XML without any formatting).
  • An automated agent can be connected on the client side (getting data from XML is easy; from HTML it is usually relatively complex).
  • You gain a cool, in-demand skill :-)

References

Internet

The four most important sources on XML are listed here. They do not carry the hype mentioned at the beginning of this paper; these guys really know what they are doing.

Books

Here are a few XML books; if you need more, just search for "XML" on Amazon.com, Borders or Barnes&Noble.

  1. XML IE5 Programmer's Reference Alex Homer / Mass Market Paperback / Published 1999
  2. XML: A Manager's Guide (Addison-Wesley Information Technology Series) Kevin Dick / Paperback / Published 1999
  3. XML Pocket Reference Robert Eckstein / Paperback / Published 1999
  4. Applied XML: A Toolkit for Programmers Alex Ceponkus, Faraz Hoodbhoy / Paperback / Published 1999
  5. Beyond Html Richard Karpinski / Paperback / Published 1996
  6. Building Web Sites with XML Michael Floyd / Textbook Binding / Published 1999
  7. Building Xml Applications Simon St. Laurent, et al / Paperback / Published 1999
  8. Client/Server Data Access With Java and Xml Dan Chang, Dan Harkey / Paperback / Published 1998
  9. Data on the Web : From Relations to Semistructured Data and Xml Serge Abiteboul, et al / Hardcover / Published 1999
  10. Designing Distributed Applications With Xml : Asp Ie5 Ldap and Msmq Stephen F. Mohr / Paperback / Published 1999
  11. Designing Xml Internet Applications (Charles F. Goldfarb Series) Michael Leventhal, et al / Paperback / Published 1998
  12. Inside XML DTDs: Scientific and Technical Simon St. Laurent, et al / Paperback / Published 1999
  13. Just Xml John E. Simpson / Paperback / Published 1998
  14. Platinum Edition Using HTML 4, XML, and Java 1.2 Eric Ladd, Jim O'Donnell / Hardcover / Published 1998
  15. Practical Guide to SGML/XML Filters Norman E. Smith / Paperback / Published 1998
  16. Presenting Xml Richard Light, Tim Bray / Paperback / Published 1997
  17. Professional Java Server Programming: with Servlets, JavaServer Pages (JSP), XML, Enterprise JavaBeans (EJB), JNDI, CORBA, Jini and Javaspaces Andrew Patzer, et al / Mass Market Paperback / Published 1999
  18. Professional Java XML Programming with servlets and JSP Myers, et al / Paperback / Published 2000
  19. Professional Stylesheets for Html and Xml Frank Boumphrey / Mass Market Paperback / Published 1998
  20. Sams Teach Yourself Xml in 21 Days (Teach Yourself...) Simon North / Paperback / Published 1999
  21. Structuring Xml Documents (Charles F. Goldfarb Series) David Megginson / Paperback / Published 1998
  22. Teach Yourself Xml Sandra E. Eddy, John E. Schnyder / Paperback / Published 1999
  23. Xml : A Primer Simon St. Laurent, Simon St Laurent / Paperback / Published 1999
  24. Xml : Extensible Markup Language Elliotte Rusty Harold / Paperback / Published 1998
  25. XML and Java: Developing Web Applications Hiroshi Maruyama, et al / Paperback / Published 1999
  26. The XML and SGML Cookbook : Recipes for Structured Information (Charles F. Goldfarb Series) Rick Jelliffe, Richard A. Jelliffe / Paperback / Published 1998
  27. XML Applications Frank Boumphrey, et al / Paperback / Published 1998
  28. XML Bible Elliotte Rusty Harold / Paperback / Published 1999
  29. XML by Example (By Example) Benoit Marchal / Paperback / Published 1999
  30. Xml by Example : Building E-Commerce Applications (Charles F. Goldfarb Series on Open Information Management) Sean McGrath / Paperback / Published 1998
  31. The XML Companion, Second Edition Neil Bradley / Paperback / Published 1999
  32. Xml Design and Implementation Paul Spencer / Paperback / Published 1999
  33. XML For Dummies® Quick Reference Mariva H. Aviram / Paperback / Published 1998
  34. The XML Handbook - 2nd Edition Paul Prescod, Charles F. Goldfarb / Paperback / Published 1999
  35. Xml Specification Guide Ian S. Graham, Liam Quin / Paperback / Published 1999
  36. XML Unleashed (Unleashed) Michael Morrison / Paperback / Published 1999
  37. XML: A Primer Simon St. Laurent, et al / Paperback / Published 1998
  38. XML: The Annotated Specification Bob DuCharme / Textbook Binding / Published 1998