Written by Sam Vaknin

The Internet is too rich. Even powerful and sophisticated search engines, such as Google, return a lot of trash, dead ends, and Error 404's in response to the most well-defined query, Boolean operators and all. Directories created by human editors - such as Yahoo! or the Open Directory Project - are often overwhelmed by the amount of material out there. Like the legendary blob, the Internet is clearly out of classificatory control. Some web sites - like Suite101 - have introduced the tried and tested Dewey subject classification system, used successfully in non-virtual libraries for more than a century. Books - both print and electronic - are assigned an ISBN (International Standard Book Number) by national agencies (strictly speaking, the numbers are allotted to their publishers). Periodical publications (magazines, newsletters, bulletins) sport an ISSN (International Standard Serial Number). National libraries dole out CIP's (Cataloguing in Publication numbers), which help lesser outfits to catalogue the book upon arrival. But the emergence of new book formats, independent publishing, and self publishing has strained this already creaking system to its limits. In short: the whole thing is fast developing into an awful mess.

Resolution is one solution.

Resolution is the linking of identifiers to content. An identifier can be a word, or a phrase. RealNames implemented this approach and its proprietary software is now incorporated in most browsers. The user types a word, brand name, phrase, or code, and gets re-directed to a web site with the appropriate content. The only snag: RealNames identifiers are for sale. Thus, its identifiers are not guaranteed to lead to the best, only, or relevant resource. Similar systems are available in many languages. Nexet, for example, provides such a resolution service in Hebrew.

The Association of American Publishers (AAP) has an Enabling Technologies Committee. Fittingly, at the Frankfurt Book Fair of 1997, it announced the DOI (Digital Object Identifier) initiative. An International DOI Foundation (IDF) was set up and invited all publishers - American and non-American alike - to apply for a unique DOI prefix. The DOI is actually a special case of a larger system of "handles" developed by the CNRI (Corporation for National Research Initiatives). Their "Handle Resolver" is a browser plug-in which re-directs handles to URL's, or to other pieces of data or content. Without the Resolver, typing in the handle simply directs the user to a few proxy servers, which "understand" the handle protocols.

The interesting (and new) feature of the system is its ability to resolve to MULTIPLE locations (URL's, or data, or content). The same identifier can resolve to a Universe of inter-related information (effectively, to a mini-library). The content thus resolved need not be limited to text. Multiple resolution works with audio, images, and even video.

The IDF's press release is worth some extensive quoting:

"Imagine you're the manager of an Internet company reading a story online in the "Wall Street Journal" written by Stacey E. Bressler, a co-author of Communities of Commerce, and at the end of the story there is a link to purchase options for the book.
Now imagine you are an online retailer, a syndicator or a reporter for an online news service and you are reading a review in "Publishers Weekly" about Communities of Commerce and you run across a link to related resources.
And imagine you are in Buenos Aires, and in an online publication you encounter a link to "D-Lib Magazine", an electronic journal produced in Washington, D.C. which offers you locale-specific choices for downloading an article.
The above examples demonstrate how multiple resolution can present you with a list of links from within an electronic document or page. The links beneath the labels - URLs and email addresses - would all be stored in the DOI System, and multiple resolution means any or all of those links can be displayed for you to select from in one menu. Any combination of links to related resources can be included in these menus.
Capable of providing much richer experiences than single resolution to a URL, Multiple Resolution operates on the premise that content, not its location, is identified. In other words, where content and related resources reside is secondary information. Multiple Resolution enables content owners and distributors to identify their intellectual property with bound collections of related resources at a hyperlink's point of departure, instead of requiring a user to leave the page to go to a new location for further information.
A content owner controls and manages all the related resources in each of these menus and can determine which information is accessible to each business partner within the supply chain. When an administrator changes any facet of this information, the change is simultaneous on all internal networks and the Internet. A DOI is a permanent identifier, analogous to a telephone number for life, so tomorrow and years from now a user can locate the product and related resources wherever they may have been moved or archived to."

The IDF provides a limited, text-only, online demonstration. When the cursor sweeps over a linked item, a pop-down menu of options is presented. These options are pre-defined and customized by the content creators and owners. In the first example above (book purchase options) the DOI resolves to retail outlets (categorized by book formats), information about the title and the author, digital rights management information (permissions), and more. The DOI server generates this information in "real time", "on the fly". But it is the author, or (more often) the publisher, who chooses the information, its modes of presentation, selections, and marketing and sales data. The ingenuity is in the fact that the DOI server's files and records can be updated, replaced, or deleted. Such changes do not affect the resolution path - only the content resolved to.
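The idea of one identifier resolving to a publisher-managed menu of resources can be sketched in a few lines of Python. This is only an illustration, not the DOI System's actual data model: the DoiRecord class, its method names, the DOI "10.9999/example-book", and the example.* addresses are all made up for the purpose.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class DoiRecord:
    """Hypothetical record: everything a single DOI resolves to."""
    doi: str
    links: Dict[str, List[str]] = field(default_factory=dict)

    def add(self, label: str, target: str) -> None:
        """Add one more target under a menu label."""
        self.links.setdefault(label, []).append(target)

    def replace(self, label: str, targets: List[str]) -> None:
        """Swap the targets behind a label; the DOI itself is untouched."""
        self.links[label] = list(targets)

    def menu(self) -> List[Tuple[str, str]]:
        """Flatten the record into the (label, target) pairs shown to the user."""
        return [(label, target)
                for label, targets in self.links.items()
                for target in targets]


# Placeholder DOI and addresses, invented for this sketch.
record = DoiRecord("10.9999/example-book")
record.add("Buy the hardcover", "https://retailer.example/hardcover")
record.add("About the author", "https://publisher.example/author")

# The publisher moves the book to another retailer: only the content changes.
record.replace("Buy the hardcover", ["https://new-retailer.example/hardcover"])

for label, target in record.menu():
    print(f"{label}: {target}")
```

The identifier embedded in the anchor document never changes; only the record behind it does, which is the property the paragraph above describes.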

Which brings us to e-publishing.

The DOI Foundation unveiled the DOI-EB (EB stands for e-books) Initiative at the Book Expo America Show 2001 in order to, in its words:

"Determine requirements with respect to the application of unique identifiers to eBooks
Develop proofs-of-concept for the use of DOIs with eBooks
Develop technical demonstrations, possibly including a prototype eBook Registration Agency."

It is backed by a few major publishers, such as McGraw-Hill, Random House, Pearson, and Wiley.

This ostensibly modest agenda conceals a revolutionary and ambitious attempt to unambiguously identify the origin of digital content (in this case, e-books) and link a universe of information to each and every ID number. Aware of competing efforts underway, the DOI Foundation is actively courting the likes of "indecs" (Interoperability of Data in E-Commerce Systems) and the OeBF (Open eBook Forum). Companies, like Enpia Systems of South Korea (a DOI Registration Agency), have already implemented a DOI-cum-indecs system. In November 2000, the AAP's (Association of American Publishers) Open E-book Publishing Standards Initiative recommended using the DOI as the primary identification system for e-books' metadata. The MPEG (Moving Picture Experts Group) is said to be considering the DOI seriously in its efforts to come up with numbering and metadata standards for digital videos. A DOI can be expressed as a URN (Uniform Resource Name - the IETF's syntax for generic resources) and is compatible with OpenURL (a syntax for embedding parameters such as identifiers and metadata in links). A "Namespace Dictionary" is to be published shortly. It will encompass 800 metadata elements and will tackle e-books, journals, audio, and video. A working group has been set up to develop a "services definition" interface (i.e., to allow web-enabled systems, especially e-commerce and m-commerce systems, to deploy the DOI).

The DOI, in other words, is designed to be all-inclusive and all-pervasive. Each DOI number is made of a prefix, specific to a publisher, and a suffix, which could end up painlessly assimilating the ISBN and ISSN (or any other numbering and database) system.
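The prefix-and-suffix anatomy is easy to demonstrate. The Python sketch below splits a DOI at its first slash and checks that the prefix starts with "10.", as DOI prefixes do. The function name parse_doi, the Doi type, and the sample identifier (a "10.9999" placeholder prefix with a made-up ISBN in the suffix) are illustrative assumptions, not part of any official tool.

```python
from typing import NamedTuple


class Doi(NamedTuple):
    prefix: str  # identifies the registrant (typically a publisher)
    suffix: str  # chosen by the publisher; may embed an ISBN or ISSN


def parse_doi(doi: str) -> Doi:
    """Split a DOI at its first slash into prefix and suffix."""
    prefix, slash, suffix = doi.strip().partition("/")
    if not slash or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a DOI: {doi!r}")
    return Doi(prefix, suffix)


# Placeholder identifier: the prefix and the ISBN are made up.
parsed = parse_doi("10.9999/isbn.0-123-45678-9")
print(parsed.prefix)
print(parsed.suffix)
```

Only the first slash separates the two parts, so a suffix is free to contain slashes of its own.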

Thus, a DOI can be assigned to every e-book based on its ISBN and to every part (chapter, section, or page) of every e-book. This flexibility could support Pay Per View models (such as Questia's or Fathom's), POD (Print On Demand), and academic "course packs", which comprise material from many textbooks, whether on digital media or downloadable. The DOI, in other words, can underlie D-CMS (Digital Content Management Systems) and Electronic Catalogue ID Management Systems.
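How a publisher might derive DOIs for a book and its parts from an ISBN, and then assemble a "course pack", can be sketched as follows. The suffix convention used here ("isbn." followed by the ISBN and dot-separated part labels) is invented for illustration; the original does not prescribe one, and the prefix and ISBNs are placeholders.

```python
def book_doi(prefix: str, isbn: str) -> str:
    """DOI for a whole e-book, using a hypothetical 'isbn.' suffix scheme."""
    return f"{prefix}/isbn.{isbn}"


def part_doi(prefix: str, isbn: str, *parts: str) -> str:
    """DOI for a chapter, section, or page: the book's DOI plus part labels."""
    return ".".join([book_doi(prefix, isbn), *parts])


PREFIX = "10.9999"  # placeholder publisher prefix

# A course pack is then just a list of part-level identifiers
# drawn from different textbooks.
pack = [
    part_doi(PREFIX, "0-123-45678-9", "ch3"),
    part_doi(PREFIX, "0-987-65432-1", "ch7", "sec2"),
]
for doi in pack:
    print(doi)
```

Because each part carries its own identifier, each can in principle be priced, licensed, and delivered separately.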

Moreover, the DOI is a paradigm shift (though, conceptually, it was preceded by the likes of the UPC code and the ISO's HyTime multimedia standard). It blurs the borders between types of digital content. Imagine an e-novel with the video version of the novel, the sound track, still photographs, a tourist guide, an audio book, and other digital content embedded in it. Each content type and each segment of each content type can be identified and tagged separately and, thus, sold separately - yet all under the umbrella of the same DOI! The nightmare of DRM (digital rights management) may be finally over.

But the DOI is much more than a sophisticated tagging technology. It comes with multiple resolution (see "Embarrassment of Riches - Part I"). In other words, unlike a URL (Uniform Resource Locator), the link is generated dynamically, "on the fly", at the user's prompt, and is not "hard coded" into the web page. This is because the DOI identifies content - not its location. And while the URL resolves to a single web page - the DOI resolves to a lot more in the form of publisher-controlled (ONIX-XML) "metadata" in a pop-up (Javascript or other) screen. The metadata include everything from the author's name through the book's title, edition, blurbs, sample chapters, other promotional material, links to related products, a rights and permissions profile, e-mail contacts, and active links to retailers' web pages. Thus, every book-related web page becomes a full-fledged book retailing gateway. The "anchor document" (in which the DOI is embedded) remains uncluttered. ONIX 2.0 may contain standard metadata fields and extensions specific to e-publishing and e-books.

This latter feature - the ability to link to the systems of retailers, distributors, and other types of vendors - is the "barcode" function of the DOI. Like barcode technology, it helps to automate the supply chain, and update the inventory, ordering, billing and invoicing, accounting, and re-ordering databases and functions. Besides tracking content use and distribution, the DOI makes it possible to seamlessly integrate hitherto disparate e-commerce technologies and facilitates interoperability among DRM systems.

The resolution itself can take place in the client's browser (using a software plug-in), in a proxy server, or in a central, dynamic server. Resolving from the client's PC, e-book reader, or PDA has the advantage of being able to respond to the user's specific condition (location, time of day, etc.). No plug-in is required when an HTTP proxy server is used - but then the DOI becomes just another URL, embedded in the page when it is created and not resolved when the user clicks on it. The most user-friendly solution is, probably, for a central server to look up values in response to a user's prompt and serve her with cascading menus or links. Admittedly, in this option, the resolution tables (which DOI links to which URL's and to which content) are not really dynamic. They change only with every server update and are static between updates. But this is a minor inconvenience. As it is, users are likely to respond with some trepidation to the need to install plug-ins and to the avalanche of information their single, innocuous, mouse click generates.
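The proxy option amounts to turning the DOI into an ordinary URL by prepending the proxy's address. A minimal Python sketch follows; it assumes the public proxy at https://doi.org/ as the base address (the article does not name one), and the function name proxy_url and the sample DOI are made up.

```python
from urllib.parse import quote

# Assumption: the public DOI proxy; the article does not specify an address.
PROXY_BASE = "https://doi.org/"


def proxy_url(doi: str, base: str = PROXY_BASE) -> str:
    """Turn a DOI into a plain URL that any browser can follow.

    Characters that are unsafe in a URL path (spaces, '#', and so on)
    are percent-encoded; slashes are kept as they are.
    """
    return base + quote(doi, safe="/")


# Placeholder DOI with characters that need encoding.
print(proxy_url("10.9999/example book#1"))
```

Once a link of this kind is written into a page it is as static as any other hyperlink, which is the trade-off noted above: no plug-in, but also no resolution on the client's side.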

The DOI Foundation has compiled this impressive list of benefits - and beneficiaries:

"Publishers to enable cross referencing to related information, control over metadata, viral distribution and sales, easy access to content, sale of granular content;
Consumers to increase value for time and money, and purchase options;
Distributors to facilitate sale and distribution of materials as well as user needs;
Retailers to build related materials on their sites, heighten consumer usability and copyright protection;
Conversion Houses/Wholesaler Repositories to increase access to and use of metadata;
DRM Vendors/Rights Clearing Houses to enable interoperability and use of standards;
Data Aggregators to enable compilation of primary and secondary content and print on demand;
Trade Associations facilitate dialog on social level and attend to legal and technical perspectives pertaining to multiple versions of electronic content;
eBook software Developers to enable management of personal collections of eBooks including purchase receipt information as reference for quick return to retailer;
Content Management System Vendors to enable internal synching with external usage;
Syndicators to drive sales to retailers, add value to retail online store/sales, and increase sales for publishers."

The DOI is assigned to publishers by Registration Agencies (of which there are currently three - CrossRef and Content Directions in the States and the aforementioned Enpia Systems in Asia). It is already widely used to cross reference almost 5,000 periodicals with a database of 3,000,000 citations. The price is steep - it costs a publisher $200 to get a prefix and submit DOI's to the registry. But as Registration Agencies proliferate, competition is bound to slash these prices precipitously.


