3.18 Future of the Web
“Web 2.0 will make the cover of Time
magazine, and thus its moment in the sun will have passed. However,
the story that drives Web 2.0 will only strengthen, and folks
will cast about for the next best name for the phenomenon.”
—John Battelle1
“We’re a long way
from the full realization of the potential of intelligent systems, and
there will no doubt be a tipping point where the systems get smart enough
that we'll be ready to say, ‘this is qualitatively
different. Let’s call it Web 3.0.’”
—Tim O’Reilly2
The XHTML
coding on websites defines their structure and layout, specifying
colors, fonts, sizes, use of bold and italic, paragraphs, tables and
the like, but not specifying the meaning of the data on the page. Web
1.0 servers sent mostly static web pages coded in HTML or XHTML to browsers
that rendered the pages on the screen. Web 2.0 applications are more
dynamic, generally enabling significant interaction between the user
(the client) and the computer (the server), and among communities of
users.
Computers have a hard time deciphering meaning from XHTML
content. The web today involves users’
interpretations of what pages and images mean, but the future entails
a shift from XHTML to a more sophisticated system based on XML,
enabling computers to better understand
meaning.
Web 2.0 companies use “data mining” to extract
as much meaning as they can from XHTML-encoded pages. For example, Google’s
AdSense contextual advertising program does a remarkable job placing
relevant ads next to content based on some interpretation of the meaning
of that content. XHTML-encoded
content does not explicitly convey meaning, but XML-encoded
content does. So if we can encode in XML
(and derivative technologies) much or all of the content on the web,
we’ll take a great leap forward towards realizing the Semantic
Web.
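
The contrast can be sketched in a few lines of Python. This is only an illustration: the `book`, `title` and `author` element names are a hypothetical vocabulary invented for this example, not a standard one. The XHTML fragment yields only presentation tags, while the XML fragment lets a program ask for a value by what it means.

```python
import xml.etree.ElementTree as ET

# Presentation markup: the tags describe appearance, not meaning.
xhtml_fragment = "<p><b>Dive Into Web 2.0</b> by <i>Deitel</i></p>"

# Hypothetical XML vocabulary: the tag names say what each value is.
xml_fragment = """
<book>
  <title>Dive Into Web 2.0</title>
  <author>Deitel</author>
</book>
"""

def tag_names(markup):
    """Return the element names found in a well-formed markup fragment."""
    return [element.tag for element in ET.fromstring(markup.strip()).iter()]

def describe_book(markup):
    """Read the title and author by name from the XML fragment."""
    root = ET.fromstring(markup.strip())
    return {"title": root.findtext("title"), "author": root.findtext("author")}

presentation_tags = tag_names(xhtml_fragment)
book = describe_book(xml_fragment)
print(presentation_tags)
print(book)
```

A program reading the first fragment learns only that some text is bold and some is italic; with the second it can look up the author directly.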
It is unlikely that web developers and users will directly
encode all web content in XML—it’s simply too tedious and
probably too complex for most web designers. Rather, the XML encoding
will occur naturally as a by-product of using various content creation
tools. For example, to submit a resume on a website, there may be a
tool that enables the user to fill out a form (with first name, last
name, phone number, career goal, etc.). When the resume is submitted,
the tool could create a computer readable microformat that could easily
be found and read by applications that process resumes. Such tools might
help a company find qualified potential employees, or help a job seeker
who wants to write a resume find resumes of people with similar qualifications.
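
As a rough sketch of such a tool, the Python below wraps submitted form fields in a machine-readable record. The `resume` element, the field names and the sample values are all assumptions made for illustration, and the output is plain XML rather than an actual resume microformat.

```python
import xml.etree.ElementTree as ET

def resume_to_xml(form_fields):
    """Wrap submitted form fields in a hypothetical <resume> XML record."""
    resume = ET.Element("resume")
    for name, value in form_fields.items():
        # Each form field becomes a child element named after the field.
        ET.SubElement(resume, name).text = value
    return ET.tostring(resume, encoding="unicode")

# Illustrative form submission (made-up values).
submitted = {
    "first_name": "Jane",
    "last_name": "Doe",
    "phone": "555-0100",
    "career_goal": "Web developer",
}

resume_xml = resume_to_xml(submitted)
print(resume_xml)

# An application that processes resumes can read fields back by name.
career_goal = ET.fromstring(resume_xml).findtext("career_goal")
print(career_goal)
```

The user only fills out a form; the structured encoding is produced as a by-product.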
Tagging and Folksonomies
Tagging and folksonomies are early hints of a “web
of meaning.” Without tagging, searching for a picture on Flickr
would be like searching for a needle in a giant haystack. Flickr’s
tagging system allows users to subjectively tag pictures with meaning,
making photos findable by search engines. Tagging is a “loose”
classification system, quite different, for example, from using the
Dewey Decimal System for cataloging books, which follows a rigid taxonomy
system, limiting your choices to a set of predetermined categories.
Tagging is a more “democratic” labeling system that allows
people, for example, to associate whatever meanings they choose with
a picture (e.g. who is in the picture, where it was taken, what is going
on, the colors, the mood, etc.).
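
One way to picture how free-form tags make photos findable is an index from each tag to the photos that carry it. The Python sketch below uses made-up file names and tags, and it is not a description of how Flickr actually stores or searches tags.

```python
from collections import defaultdict

def build_tag_index(photo_tags):
    """Map each tag (lowercased) to the set of photos labeled with it."""
    index = defaultdict(set)
    for photo, tags in photo_tags.items():
        for tag in tags:
            index[tag.lower()].add(photo)
    return index

def find_photos(index, *tags):
    """Return the photos that carry every one of the given tags."""
    matches = [index.get(tag.lower(), set()) for tag in tags]
    return set.intersection(*matches) if matches else set()

# Users choose whatever tags they like; there is no fixed category list.
photos = {
    "img_001.jpg": ["beach", "sunset", "Maui"],
    "img_002.jpg": ["beach", "family"],
    "img_003.jpg": ["sunset", "city"],
}

index = build_tag_index(photos)
beach_sunsets = find_photos(index, "beach", "sunset")
print(beach_sunsets)
```

Unlike a rigid taxonomy, nothing here limits which tags may be used; a new tag simply becomes a new key in the index.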
Semantic Web
“People keep asking what Web 3.0 is.
I think maybe when you've got an overlay of scalable vector graphics—everything
rippling and folding and looking misty—on Web
2.0 and access to a semantic Web integrated across a huge space of data,
you'll have access to an unbelievable data resource.”
—Tim Berners-Lee3
“The Holy Grail for developers of the
semantic Web is to build a system that can give a reasonable
and complete response to a simple question like: I’m
looking for a warm place to vacation and I have a budget of $3,000.
Oh, and I have an 11-year-old child.
Under
Web 3.0, the same search would ideally call up a complete vacation
package that was planned as meticulously as if it had been assembled
by a human travel agent.”
—John Markoff4
Many people consider the Semantic
Web to be the next generation in web development, one that
helps to realize the full potential of the web. This is Tim Berners-Lee’s original
vision of the web, also known as the “web
of meaning.”5 Though Web 2.0 applications are finding meaning in content, the
Semantic Web will attempt to make those meanings clear to computers
as well as humans. It will be a web able to answer complex and subtle
questions.
Realization of the Semantic Web depends heavily on XML and XML-based technologies (see Chapter 14), which help make
web content more understandable to computers. Currently, computers “understand”
data on basic levels, but are progressing to find meaningful connections
and links between data points. The emerging Semantic Web technologies
highlight new relationships among web data. Some experiments that emphasize
this are Flickr
and FOAF (Friend of a Friend), a research project that “is creating
a Web of machine-readable pages describing people, the links between
them and the things they create and do.”6 Programming in both instances involves links between databases—ultimately
allowing users to share, transfer, and use each other’s information
(photos, blogs, etc.).7
Preparations for the Semantic Web have been going on for
years. XML is already widely used in both online and offline applications,
but still only a minute portion of the web is coded in XML or derivative
technologies. Many companies, including Zepheira,
an information management company, and Joost, an Internet TV provider, already
use semantic technologies in working with data. Deterring Semantic Web
development are concerns about the consequences of false information
and the abuse of data. Since the Semantic Web will rely on computers
having greater access to information and will yield a deeper understanding
of its significance, some people worry about the potentially increased
consequences of security breaches. The Policy
Aware Web Project is an early attempt at developing standards
to encourage data sharing by providing access policies that can sufficiently
address individuals’ privacy concerns.8
Microformats
“We need microformats that people agree
on.”
— Bill Gates, MIX06 conference9
Some people look at the web and see lots of “loose”
information. Others see logical aggregates, such as business cards,
resumes, events and so forth. Microformats
are standard formats for representing information aggregates that can
be understood by computers, enabling better search results and new types
of applications. The key is for developers to use standard microformats,
rather than developing customized, non-standard data aggregations. Microformat
standards encourage sites to similarly organize their information, thus
increasing interoperability. For example, if you want to create an event
or an events calendar, you could use the hCalendar microformat. Some
other microformats are adr for address information, hresume for resumes, and xfolk for collections of bookmarks.10
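
To suggest how a program might read an event marked up this way, here is a minimal Python sketch. The class names `vevent`, `summary`, `dtstart` and `location` are hCalendar property names; the event itself is made up, and the parser handles only this simple case rather than the full hCalendar rules.

```python
from html.parser import HTMLParser

# The hCalendar properties this sketch looks for.
HCAL_PROPERTIES = {"summary", "dtstart", "location"}

class HCalendarParser(HTMLParser):
    """Collect a few hCalendar properties from each vevent element."""

    def __init__(self):
        super().__init__()
        self.events = []
        self._current = None  # property whose text we are waiting for

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if "vevent" in classes:
            self.events.append({})
        if not self.events:
            return
        for name in classes:
            if name in HCAL_PROPERTIES:
                if tag == "abbr" and attrs.get("title"):
                    # The machine-readable value lives in the title attribute.
                    self.events[-1][name] = attrs["title"]
                else:
                    self._current = name

    def handle_data(self, data):
        if self._current and data.strip():
            self.events[-1][self._current] = data.strip()
            self._current = None

# Made-up event for illustration.
page = """
<div class="vevent">
  <span class="summary">Web 2.0 Meetup</span>
  <abbr class="dtstart" title="2007-10-05">October 5</abbr>
  <span class="location">Boston, MA</span>
</div>
"""

parser = HCalendarParser()
parser.feed(page)
print(parser.events)
```

Because every site that uses the microformat uses the same class names, the same small parser works on all of them.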
Resource Description Framework (RDF)
The Resource Description Framework
(RDF), developed by the World Wide Web Consortium (W3C),
is commonly serialized as XML and used to describe content in a way that is understood
by computers. RDF helps connect isolated databases across the web with
consistent semantics.11 The structure of any expression in RDF is a collection of triples.12 An RDF triple consists of two pieces
of information (a subject and an object) joined by a linking fact (a predicate).
Let’s create a simple RDF triple. “Chapter 3, Dive
Into® Web 2.0” is the title of this document;
it identifies the thing being described, so we’ll use it as the subject
of our RDF triple. “Deitel” is the author of this chapter
and serves as the object. So the sentence “Chapter 3, Dive Into®
Web 2.0 is written by Deitel” is an RDF triple, containing
a subject, an object and the predicate that links them (“is written by”).
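
The same triple can be written down as a small data structure. The Python below is a minimal sketch that uses plain strings for readability; real RDF generally identifies subjects and predicates with URIs, which this example leaves out.

```python
from collections import namedtuple

# A triple has exactly three parts: subject, predicate, object.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

chapter = Triple(
    subject="Chapter 3, Dive Into Web 2.0",
    predicate="is written by",
    object="Deitel",
)

def to_sentence(triple):
    """Read a triple back as the sentence it encodes."""
    return f"{triple.subject} {triple.predicate} {triple.object}"

sentence = to_sentence(chapter)
print(sentence)
```

Naming the three parts lets a program ask for the predicate or the object directly instead of parsing a sentence.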
DBpedia.org is currently converting content from Wikipedia,
one of the largest and most popular resources of online information,
into RDF triples, which can then be queried using SPARQL (SPARQL Protocol and RDF Query
Language). In June
2007, the project claimed to have over 91 million triples, which will allow
the information (from Wikipedia) to be accessed by more advanced search
queries.13
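
To give a feel for what querying a collection of triples means, the Python sketch below matches a pattern against a few in-memory triples, treating `None` as a wildcard. It is not SPARQL and does not contact DBpedia; the triples are illustrative and the `match` function is a hypothetical helper written for this example.

```python
def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None matches anything."""
    pattern = (subject, predicate, obj)
    return [
        triple
        for triple in triples
        if all(want is None or want == have for want, have in zip(pattern, triple))
    ]

# Illustrative (subject, predicate, object) triples.
triples = [
    ("Chapter 3", "has title", "Dive Into Web 2.0"),
    ("Chapter 3", "is written by", "Deitel"),
    ("Chapter 14", "is written by", "Deitel"),
]

# "Which things are written by Deitel?"
written_by_deitel = match(triples, predicate="is written by", obj="Deitel")
print([subject for subject, _, _ in written_by_deitel])
```

A query language for triples works in a similar spirit: part of the pattern is fixed and the store returns whatever fills in the rest.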
Ontologies
Ontologies are ways of organizing
and describing related items, and are used to represent semantics. This
is another means of cataloging Internet content in a way that can be
understood by computers.14 RDF is designed for formatting ontologies. OWL
(Web Ontology Language), also designed for formatting ontologies
in XML, extends beyond the basic semantics of RDF ontologies to enable
even deeper machine understanding of content.15
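
A very small ontology can be pictured as a set of "is a kind of" links that a program follows to draw conclusions nobody stated directly. The Python sketch below uses invented class names and a single parent per class; actual RDF and OWL ontologies are far richer.

```python
# Hypothetical ontology: each class names its direct parent class.
SUBCLASS_OF = {
    "Chapter": "Document",
    "Document": "Resource",
    "Photo": "Resource",
}

def is_a(cls, ancestor):
    """Follow subclass links upward to see whether cls is a kind of ancestor."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)  # None once we pass the top of the hierarchy
    return False

print(is_a("Chapter", "Resource"))
print(is_a("Photo", "Document"))
```

Nothing in the table says that a chapter is a resource, yet the program can infer it from the two links it does have.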
Closing Comment
This book will get you up to speed on Web
2.0 applications development. Building a “web of meaning”
will ultimately open a floodgate of opportunities for web developers
and entrepreneurs to write new applications, create new kinds of businesses,
etc. We don’t know exactly what the “web of meaning”
will look like, but it’s starting to take shape. If it helps accomplish
what many leaders in the web community believe is possible, the future
of the web will be exciting indeed.