tags that would give spiders and Web robots some hints as to what they are looking at when performing their brute-force searches.
In contrast to brute-force search engines, a few Web sites, such as Yahoo and About.com, are metadata-based. Information at such sites is organized by humans based on subjects and labels. The metadata used at these sites is minimal compared with what is used in a library and with what should be used in a perfect world. Yet, the search results are superior to those produced by brute-force search engines.
The World Wide Web Consortium (W3C) recognizes the dire need for metadata on the Web, but we all realize that we can't create one unified set of metadata that will be used uniformly on the Web. Thus, RDF (Resource Description Framework) was created to provide a framework for creating and using metadata on the Web, rather than trying to enforce a particular mechanism for using metadata.
Both RDF's definition and the applications that use it have room to grow. RDF now provides the following features:
- Interoperability of metadata.
- Machine-understandable semantics for metadata.
- Better precision in resource discovery than full text search.
Further development of RDF will also provide a uniform query capability for resource discovery, which will enable the development of applications that leverage RDF. The RDF Interest Group holds discussions on RDF's development.
Integral to RDF are the basic concepts of a resource and a property. A resource is anything that has a URI (Uniform Resource Identifier). This includes the vast number of Web pages, as well as individual XML elements in XML pages. Resources can have one or more properties associated with them. A property is itself a resource that has a name -- for example, "Subject" or "Publisher." This design allows for extensibility, as properties themselves can have properties or be complex resources.
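The resource/property relationship can be sketched in a few lines of Python. This is only an illustration, not part of RDF itself: the `Resource` class, its `add` method and the `http://example.org/schema/` URIs are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Anything identified by a URI (hypothetical illustration class)."""
    uri: str
    # Maps the URI of a property to the list of values recorded for it.
    properties: dict = field(default_factory=dict)

    def add(self, prop, value):
        """Attach a value to this resource under the given property."""
        self.properties.setdefault(prop.uri, []).append(value)


# Properties are resources too, so they get URIs of their own.
# The example.org URIs below are placeholders, not a real schema.
creator = Resource("http://example.org/schema/Creator")
label = Resource("http://example.org/schema/label")
page = Resource("http://www.apicalconsulting.com")

page.add(creator, "Ahmad Abualsamid")  # a property of a Web page
creator.add(label, "Creator")          # a property of a property
```

Because `creator` is an ordinary `Resource`, it can carry properties of its own, which is the extensibility the design is after.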
RDF data is best thought of as a collection of nodes, each made up of a resource/property pair. The value of the property can be a single value -- a text string, a number or another resource. This organization closely follows that of XML. In fact, a requirement in the RDF design is to be able to express RDF data in XML in a straightforward manner. As such, XML is the encoding syntax of RDF. Consider, for example, the following XML statements describing an RDF data node:
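<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:s="http://example.org/schema/">
<!-- The s: namespace URI above is a placeholder; the schema in use is not specified here. -->
<rdf:Description rdf:about="http://www.apicalconsulting.com">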
<s:Creator>Ahmad Abualsamid</s:Creator>
</rdf:Description>
</rdf:RDF>
The English equivalent of this XML would be: "Ahmad Abualsamid is the creator of the resource http://www.apicalconsulting.com." So why not use just XML and do away with the added complexity of the RDF concept? Tim Bray, the co-author of XML Namespaces and a well-known authority on RDF and XML, says that, given the sheer volume of information on the Web, XML alone would fall short of the scalability needed to make RDF useful.
There are two reasons behind this. First, XML cares about the order and nesting of the elements that make up an XML document. In contrast, when your goal is to perform searches, you do not really care about the order of the properties associated with the target resource. For example, while searching for a book, you do not care whether the title is listed before the author or vice versa. Second, because XML structures are much more complex than a simple resource-property pair, the amount of memory and disk space required to represent the Web's metadata in XML would be enormous.
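The first point, that order matters to XML but not to the metadata, can be illustrated with a short Python sketch. It uses the standard library's `xml.etree.ElementTree` to reduce a very simple RDF/XML document to a set of (resource, property, value) triples. The `triples` helper, the `s:Title` property, its value and the `http://example.org/schema/` namespace are assumptions made for this example, and the helper handles only flat descriptions like the one shown earlier, not the full RDF syntax.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def triples(rdf_xml):
    """Return the set of (resource, property, value) triples in a flat RDF/XML string."""
    root = ET.fromstring(rdf_xml)
    result = set()
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        # Accept both the namespaced and the bare 'about' attribute.
        subject = desc.get(f"{{{RDF_NS}}}about") or desc.get("about")
        for prop in desc:
            result.add((subject, prop.tag, (prop.text or "").strip()))
    return result


# The s: namespace, the Title property and its value are placeholders.
HEADER = (
    '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
    'xmlns:s="http://example.org/schema/">'
    '<rdf:Description rdf:about="http://www.apicalconsulting.com">'
)
FOOTER = "</rdf:Description></rdf:RDF>"
CREATOR = "<s:Creator>Ahmad Abualsamid</s:Creator>"
TITLE = "<s:Title>Apical Consulting</s:Title>"

doc_a = HEADER + CREATOR + TITLE + FOOTER  # creator listed first
doc_b = HEADER + TITLE + CREATOR + FOOTER  # title listed first

print(doc_a == doc_b)                    # False: the XML text differs
print(triples(doc_a) == triples(doc_b))  # True: the metadata is the same
```

As XML, the two documents are different; as sets of resource-property pairs, they say exactly the same thing, which is all a search needs to know.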