Video, Computer-Generated Environments and the Future of the Web

Video, Computer-Generated Environments

and the Future of the Internet

By Ian Lamont

(For graduate credit)

HUMA E-105: Survey of Publishing, from Text to Hypertext

Harvard University Extension School

January 16, 2008

Almost since the first stuttering video clips appeared on the World Wide Web,

observers have predicted that video will come to dominate the Internet. Mitchell

Stephens, writing in the mid-1990s, foresaw the rise of sophisticated video production

and narrative techniques derived in part from the “merger” of computers and video.1 He

also believed the Web would play an important role for video, primarily as an on-demand

distribution platform that would allow viewers to be finally freed from television

schedules.2 Another commentator, writing more recently about the future of the Internet,

proclaimed video as “king,” thanks in large part to the popularity of amateur videos and

fan websites, and the rush of advertising dollars to online video content.3 Google,

Microsoft, Apple, Cisco, Verizon, and many other technology companies apparently

agree with these sentiments, spending billions of dollars on fiber-optic networks, massive

data centers, and robust hardware and software platforms to deliver video over the

Internet. While their technologies and business models are often in direct competition,

there seems to be widespread consensus that the Internet will evolve into some sort of

universal cable channel that showcases all kinds of video — from brief amateur video

clips to Hollywood films — to potentially everyone with broadband Internet access,

whenever and nearly wherever they choose. In such an environment, goes the reasoning,

text, audio, still images, and everything else will play secondary roles.

1 Mitchell Stephens, The Rise of the Image, the Fall of the Word (New York:

Oxford University Press, 1998), 164.

2 Stephens, 171.

3 Bambi Francisco, “Net Sense: The Future of the Internet,” MarketWatch. Available from http://www.marketwatch.com/news/story/net-sense-future-internet-video/story.aspx?guid=%7B6115530A-15F3-4FDC-B7F6-55FB493D356E%7D.

I would like to offer an alternative to this video-centric vision outlined by

Stephens and others. While video is a compelling medium that may one day rival text-

based websites in popularity, it will not dominate the Internet for long. I will argue that

another type of content — one that shares video’s visual appeal, yet currently falls into

the “everything else” category — will eventually overshadow video. That content will

consist of sophisticated computer-generated environments, delivered in a variety of

formats and serving many different types of customer needs, including entertainment,

news, and community. These formats will use advanced computer graphics to deliver

photorealistic, three-dimensional representations of real and imagined spaces to a vast,

online audience, and allow audience members to interact with these environments and

each other in ways that are not possible with video.

Video — which I define as television, film, home movies, and any other moving

images derived from the movements of lit subjects and scenery in front of a camera lens

— was the dominant visual mass medium of the 20th century. It has had a profound

impact on society and world history, as evidenced by the power of moving images to

educate, propagate, agitate, inform and entertain. Stephens called video humankind’s

“third major revolution,” after writing and print.4

Indeed, many of the major events and societal trends of the last century were

shaped by this mass medium. Charlie Chaplin, Al Jolson, and Lillian Gish can be

considered among the first international superstars, beloved by tens of millions across all

social classes and in many countries all over the world, thanks to their leading roles in

Hollywood films in the 1910s and 1920s. Stardom was not unknown before film, but

4 Stephens, 11.

pre-mass media musicians, actors, orators and authors were restricted to live performances

and personal appearances, which limited their popularity. Film made it possible for actors

to simultaneously reach millions of people in cities and towns across America, and for

performances to be watched over and over again. The impact on the public was

tremendous.

Politicians similarly expanded their audiences and platforms using the power of

moving pictures. The rise of Adolf Hitler and the Nazi party in Europe in the 1930s was

partially due to the influence of Leni Riefenstahl’s Triumph of the Will and other

propaganda films that promoted core Nazi beliefs while casting Jews and other groups in

harshly negative terms.5 John F. Kennedy’s political rise has been linked to an uneven

televised presidential debate with Richard Nixon in 1960,6 and his death in a Dallas

motorcade — captured on an 8 mm film camera by a bystander named Abraham

Zapruder — sparked a national sense of mourning. Nearly three decades later, another

amateur video showing a group of police officers beating a black taxi driver named

Rodney King on a Los Angeles street eventually led to several days of deadly urban riots

across the U.S.

Besides changing the course of history, video has come to govern our daily lives,

and serves as an important means of understanding our world. While film was just a

fringe entertainment in 1900, it became a regular part of public life within a few decades.

5 Elliot Aronson and Anthony Pratkanis, Age of Propaganda: The Everyday Use and Abuse of Persuasion (New York: Henry Holt, 2001), 323.

6 “1960: Kennedy-Nixon Debates.” Electronic Government Project, Eagleton Digital Archive of American Politics, Rutgers University. Available from http://www.eagleton.rutgers.edu/e-gov/e-politicalarchive-JFK-Nixon.htm.

By 1921, one source estimated that annual U.S. box office receipts totaled $850 million,

and the film industry was supporting hundreds of thousands of jobs.7 Television

also made rapid inroads, expanding from just seven thousand sets nationally by the end of

World War II to ten million receivers in 1950.8 Comedies, dramas, rebroadcast films and

other entertainment formats were not the only popular types of television programming.

Generations of children have been raised on a regular diet of educational programs and

cartoons, and television news became one of the primary sources of news, rivaling the

popularity of newspapers and magazines. As recently as December 2005, a survey of

American consumers found that 59% got news the previous day from local television and

47% from national television, compared to 44% from radio, 38% from a local newspaper,

23% from the Internet, and 12% from a national newspaper.9

Clearly, video continues to have a strong hold over audiences. Its ability to show

events, tell stories, and faithfully reproduce the words and actions of living beings gives it

an advantage over text-based formats such as printed periodicals, books and blogs.

Stephens also noted video’s ability to take viewers “elsewhere,” thanks to the way moving

images dominate the input to our eyes and ears:

7 “Revolutionary Talking Movies: Widespread Changes That Are Predicted If New Invention Is a Success — Elimination of Numerous Stars.” The New York Times, September 10, 1922. Available from http://query.nytimes.com/gst/abstract.html?res=9F07E5DE1F3AE433A25753C1A96F9C946395D6CF

8 Stephens, 46.

9 John B. Horrigan, “Online News: For many home broadband users, the Internet is a primary news source.” Pew Internet and American Life Project, March 22, 2006.

We misunderstand moving images when we think of them merely as a form of communication, a type of entertainment, a means of information or an art form. Perhaps books, newspapers or radio can squeeze under such headings. Moving images with sound, because they occupy both of our major senses, cannot. They are more than that. They are a place we go.10

Stephens outlined a bright future for video in his 1998 book, The Rise of the

Image, the Fall of the Word. According to his thesis, television and film throughout the

20th century were generally unoriginal.11 He said that video needed to be reinvented in a

way that would enhance its strengths and eventually make it the pre-eminent medium for

telling stories and conveying information, even complex information that has

traditionally been the realm of print discourse.12 “... Once we move beyond simply

aiming cameras at stage plays, conversations, or sporting events and perfect original uses

of moving images, video can help us gain new slants on the world, new ways of seeing,”

he said. “It can capture more of the tumult and confusions of contemporary life than tend

to fit in lines of type.”13

The “new video” outlined in The Rise of the Image, the Fall of the Word incorporated

some of the techniques developed by avant-garde filmmakers and directors working with

music videos and television commercials, as well as new conventions and technologies

envisioned by Stephens. Juxtapositions, fast cutting, densely packed imagery, new

symmetries, an “excess of perspectives,” musical structure, new symbols and forms of

10 Stephens, 124-125.

11 Stephens, 91.

12 Stephens, 179.

13 Stephens, 18.

representation, surrealism, and computer graphics would characterize new video.14 The

tastes and preferences of audiences, he continued, would evolve to embrace new video,

while spoken languages and the printed word would “increasingly be a less precise, less

subtle language — one designed for use with images.”15

In his new video paradigm, Stephens described the importance of computing

technologies. Graphics would play central design and production roles. Computer-

generated imagery would be used for transitions, charts, and creative expression that

would allow directors to express their artistic visions and sense of fantasy, while

emphasizing the juxtapositions that he felt were so crucial to new video.16 After

production was completed, computers would serve as a more effective distribution

medium than movie theaters, terrestrial television and cable. The “network computer,”

in the form of larger computers situated in people’s living spaces or portable wireless

devices, would serve as the primary conduit for anywhere, anytime video:

… As more and more video is produced, with the mass marketing of digital video editing, and as more and more video is stored in databases and accessed on Web-like networks, it seems inevitable that the screens of those network computers will be filled much of the time with moving images. … A whole range of full-screen video — tracked down in archives, discovered through hypertext (hypervideo?) forwarded by friends, crafted by artists, assigned by professors.17

14 Stephens, 182-199.

15 Stephens, 209.

16 Stephens, 196-197.

17 Stephens, 172.

Stephens accurately predicted current technologies such as viral video, YouTube,

and the video iPod. The character of video has been slower to evolve in the direction he

predicted, but it may some day come close to matching his vision.

However, Stephens and many of the other boosters who have predicted the

eventual dominance of video on the Web have failed to adequately address the inherent

conflict between the two mediums. In the video world, the director and others behind the

camera lens tell linear stories. The audience watches the screen passively. Stephens

readily accepted this drawback, noting “we have absolutely no influence whatsoever, free

or otherwise, on anything that transpires. Movies and television shows proceed entirely

without us.”18

The Internet, in contrast, is optimized for interactivity. It is a massive, distributed

computer network that was originally envisioned in the late 1960s as a robust

communications and file-transfer tool linking geographically dispersed computers and

networks.19 In the 1970s and 1980s, Internet traffic consisted of data transfers, text

messages, and relatively simple games. The audience was mostly limited to a small

population of computer-savvy users who had a connection through work, school, or one

of the early commercial service providers. By early 1993, there were just three to four

million Internet users worldwide, and only several tens of thousands of network nodes.20

18 Stephens, 126-127.

19 Gina Smith, “Unsung innovators: Robert Kahn, the ‘stepfather’ of the Internet.” Computerworld, December 3, 2007. Available from http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=9046801.

That would shortly change. In 1994, the World Wide Web — a set of protocols

and technologies that sit on top of the Internet — leapt out of research labs and college

campuses, sparking a communications and media revolution. Instead of typing in text

commands to access information or send messages, people were presented with a screen

full of information and options. A Web page might consist of text, photographs, and

music. Almost all pages were linked with at least one other page. Pages could also access

software applications, databases, and other computing resources. Users could navigate

these interconnected pages with a Web browser and a mouse. This greatly simplified

access to the Internet, and opened up Internet-based content to mainstream audiences.

Content quickly moved beyond static pages containing text, links and

photographs. On some sites, pages contained forms that allowed people to input text and

software commands. This enabled developers to create front-end interfaces to back-end

databases and other networked resources, which in turn gave users access to online

discussion forums, search engines, online shopping, and a variety of registration-based

services. In other words, the audience was not just limited to looking at content. They

could react to it, respond to it, alter it, and discuss it or, for that matter, anything else on

their minds.

The impact of the Web on the Internet, mass media, business, and society has

been enormous. As of late 2007, nearly half of all Americans had high-speed Internet

connections at home. Many were using the Internet for social interactions, publishing

blogs or personal web pages, or looking for information in order to make important life

20 Bruce Sterling, “Short History of the Internet.” The Magazine Of Fantasy And

Science Fiction, February 1993. Available from http://w3.aces.uiuc.edu/AIM/scale/nethistory.html.

decisions, such as making a major investment, getting career training, choosing a school,

or helping someone with a health-related decision.21 Among young people, the Internet

has emerged as a central tool for socialization and interaction with friends. According to

survey data released by the Pew Internet and American Life Project, 55% of Americans

aged 12-17 have created a profile on Facebook, MySpace, or another social networking

website. Almost all of the teens that use these sites say they do so to keep in touch with

friends (including those whom they often see in person) and to make plans with them.

The conversations extend to other Web services. Twenty-eight percent of teens say they

blog, and many of them — especially those who already have social networking profiles

— like to leave comments on others’ blogs. Even posting a video or digital photograph

“often starts a virtual conversation” through the commenting features of such services.22

This trend points to the fact that the Web is more than just an on-demand distribution

channel for video. The Internet lets audiences and organizations link, categorize,

comment on, rate, map, tag, buy, sell, market, and edit video in ways that very few

people imagined just five years ago.

Moreover, the Internet has eroded the control of the television and film industries

and traditional “gatekeepers” who work for them — writers, reporters, editors, publicists,

publishers, etc. The decline of industry power and control goes beyond the ability of

viewers to visit online forums to criticize a television news network’s supposed political

bias, read film blogs written by unpaid amateur film critics, or apply descriptive

21 John B. Horrigan, “Broadband: What’s All the Fuss About?” Pew Internet & American Life Project, October 18, 2007.

22 Amanda Lenhart, Mary Madden, Alexandra Rankin Macgill, Aaron Smith, “Teens and Social Media.” Pew Internet & American Life Project, December 19, 2007.

del.icio.us tags to the official websites of Hollywood stars. Members of the public have

been transformed into an army of self-propelled video producers with an international

audience, thanks to easy-to-use software tools, cheap consumer electronics, and the

widespread availability of broadband Internet connections. The video they produce tends

to consist of home movies, simple dramas and humor, and amateur recordings of live

events. However, some of the content is entertaining enough to attract large numbers of

viewers. On a recent evening, the dozen featured videos on the front page of YouTube

had between 135,530 and 2,237,096 views apiece.23 Other content, while amateurish, is

compelling to small numbers of people. Examples include the homemade music videos

and live-action plot recreations based on the movie Cars that are readily available on

YouTube. These clips do not meet Hollywood production values, clearly violate

copyright law, and conflict with the marketing and public relations campaigns maintained

by Disney/Pixar, yet are a minor hit with a small group of fans who are starved for Cars-

related video content. The audience for an average amateur clip on YouTube might

consist of just a few dozen people — a manifestation of the so-called “Long Tail” niche

consumption pattern that characterizes Internet content.24 The strength of the Long Tail

becomes apparent when one considers the millions of clips that are available on YouTube

or other video-hosting sites. Most clips have a few dozen or few hundred views, but the

aggregate audience is actually quite large, numbering in the millions or tens of millions

23 135,530 views for “Drunk History vol. 1 - Featuring Michael Cera” and 2,237,096 views for “The Original Human TETRIS Performance by Guillaume Reymond.” YouTube.com. Data gathered at 10:20 pm on January 7, 2008.

24 Chris Anderson, “Long Tail 101.” The Long Tail: A Public Diary of Themes Around a Book. Available from http://www.longtail.com/the_long_tail/faq/index.html.

of people. Video created by the masses is now competing with the professional video

produced by the entertainment industry.

In the news industry, amateur video footage provides a different sort of

competition. Members of the public not only happen to witness news, they often gain

access to people and places that broadcast news professionals cannot or will not see.

They are able to capture vivid, on-scene accounts of major and minor events. The

Zapruder and Rodney King home movies were early examples of this movement. Then,

the devices were relatively expensive and there was no way to distribute the video to a

wide audience, except through traditional media outlets such as television news. Now,

cheap webcams, video cameras and mobile phones with built-in cameras make it possible

for practically anyone to record news events. The Internet lets them distribute the footage

to a huge audience, and lets them bypass traditional gatekeepers, their professional

editing requirements, and ethical codes. The footage they shoot is raw and real. It can be

brutally honest and compelling, but also provocative and biased. The December 2004

Indian Ocean tsunami was a watershed moment in this respect. For the first time, global

awareness of a major news event was shaped in large part by footage shot by amateurs

and distributed via the Internet. The footage was disturbing, but captured the scope of the

destruction far more effectively than broadcast news outlets, which had no reporters on

scene when the waves first struck the beaches.

Despite the rise of amateur video and the new modes of distribution and

discussion, the Internet and computer technologies have not been able to change the

fundamental character of video. Whether someone watches video on a television screen,

or plays it on YouTube, video is a linear, passive experience, designed to be watched

from beginning to end without alterations or input from the audience. For Web video,

interactivity is limited to tangential content — the text links in the navigation column, the

comment field below the Flash video player, the icon-based ratings systems, and the off-

site commentary on blogs and discussion boards. The video itself has none of these

features. Objects on the video screen are not linked. An audience member cannot easily

reshoot it, to make it more to his or her liking. What the viewer sees depends upon

whatever lit subject or scenery passed in front of the lens, and whatever creative choices

the people controlling the camera and editing the footage decided to apply. This has

always been the fundamental character of video. In this sense, a two-minute clip of an

Independence Day parade on YouTube is not much different than Fred Ott's Sneeze, an

1894 kinetoscope film produced by Thomas Edison showing one of his employees

sneezing.25

The failure of video, or new video, to move beyond a static, linear storytelling

device does not mean Web video is doomed. It has a healthy future, as experimentation

with formats continues and more members of the public learn to use cameras, editing

software, and Internet publishing tools. In addition, video is the best tool to accomplish

certain tasks, or tell certain stories — such as documenting nature, showing news events,

and recording living people. Video will also benefit from sophisticated applications that

use metadata, descriptive pieces of information assigned to individual pieces of content

by humans or software programs. For instance, a video clip stored on my computer may

25 Mary Hanlon, “Movie Audiences, Movie Myth: Early Cinema as Invention, Entertainment, Instruction.” Early Films, The Silent Western: Early Movie Myths of the American West. Available from http://xroads.virginia.edu/~hyper/hns/westfilm/movie.htm

have metadata that identifies the make and model of the camera used to shoot it, the date it

was created, and the dimensions of the frame in pixels. I may further “tag” it with simple

descriptive labels that help me categorize it: “home,” “kids,” and “Fido.” If I post it on the

Web, friends and family members may add their own tags: “funny,” “cute,” “Golden

Retriever.” This data may help other people’s Web searches and online activities — for

instance, someone searching for pictures of Golden Retrievers may find mine, and then

republish it on her blog post about cute Golden Retrievers. It is through the power of

metadata and tagging that a YouTube clip of a new electronic gadget can get hundreds of

thousands of views in just a few days.
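
The tagging workflow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of how YouTube or any other service stores or searches its data: the Clip class, the add_tags and search helpers, the camera name, and the file name are all hypothetical and were introduced here for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Clip:
    """A video clip with technical metadata and free-form descriptive tags."""
    filename: str        # hypothetical file name
    camera: str          # make and model recorded by the device
    created: str         # creation date, in ISO format
    width: int           # frame width in pixels
    height: int          # frame height in pixels
    tags: set = field(default_factory=set)


def add_tags(clip, *labels):
    """Attach tags, lowercased so that searches are case-insensitive."""
    clip.tags.update(label.lower() for label in labels)


def search(clips, term):
    """Return every clip that carries the given tag."""
    wanted = term.lower()
    return [clip for clip in clips if wanted in clip.tags]


# Hypothetical data: the owner tags the clip, then friends and family add more.
fido = Clip("fido.mov", "ExampleCam 100", "2008-01-05", 640, 480)
add_tags(fido, "home", "kids", "Fido")
add_tags(fido, "funny", "cute", "Golden Retriever")

library = [fido]
matches = search(library, "golden retriever")
print([clip.filename for clip in matches])
```

In this sketch, tags added by different people accumulate on the same clip, which is why a stranger's search for "Golden Retriever" can surface a home movie that its owner labeled only "Fido."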

Two emerging metadata applications that will help audiences more precisely find

and use video content are geotagging and autotagging. Geotags are geospatial data in a

file or software program that identify the location of some object, such as a building or

person. Some cameras with built-in Global Positioning System (GPS) devices can

automatically geotag images, which can later be associated with addresses on a map or

searched more effectively (“find all photos taken in the 02166 ZIP code”). Autotagging

is the automatic application of descriptions to a piece of content, without human review.

Penn State researchers Jia Li and James Z. Wang have developed software that can be

trained to automatically recognize the objects in images, and apply metadata to them.26

Once this technology is applied to moving images, it will be possible to more effectively

organize video content and design online applications that let people view and use video

in a profoundly different manner than we use it now. This goes beyond simply entering a

term in a search engine and finding the most closely matching videos; it will enable video

26 Jia Li and James Z. Wang, Automatic Linguistic Indexing of Pictures - Real

Time. Available from http://www.alipr.com/.

to be more precisely integrated into the other software applications that will operate in

our homes, classrooms, and places of work. Imagine generating a personalized report for

a family trip to the zoo that displays recent amateur video of new exhibits and live

camera feeds of the traffic situations along two potential routes. Metadata will make this

possible.
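
A location-based search over geotagged clips might look something like the following Python sketch. It assumes, purely for illustration, that each clip carries a latitude and longitude and that the search area is a simple rectangle; the GeoTaggedClip class, the find_in_area function, the file names, and the coordinates are hypothetical. A real application would also need a geocoding service to translate a postal code or street address into coordinates, which is not shown here.

```python
from dataclasses import dataclass


@dataclass
class GeoTaggedClip:
    """A clip whose metadata includes the place where it was recorded."""
    filename: str
    lat: float   # latitude in decimal degrees
    lon: float   # longitude in decimal degrees


def find_in_area(clips, south, west, north, east):
    """Return the clips recorded inside a rectangular latitude/longitude box."""
    return [
        clip for clip in clips
        if south <= clip.lat <= north and west <= clip.lon <= east
    ]


# Hypothetical clips and coordinates, chosen only to exercise the filter.
clips = [
    GeoTaggedClip("zoo_entrance.mov", 42.30, -71.09),
    GeoTaggedClip("beach.mov", 41.65, -70.28),
]

nearby = find_in_area(clips, south=42.2, west=-71.2, north=42.4, east=-71.0)
print([clip.filename for clip in nearby])
```

The same filter could feed the kind of personalized trip report imagined above, selecting only the footage recorded near a destination or along a chosen route.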

Nevertheless, I believe a family of graphics technologies will eventually

overshadow video and realize the true interactive potential of moving images accessed

via the Internet. The technologies employ three-dimensional computer-generated

environments. These environments are not science fiction, or obscure laboratory

experiments — they are already widely used in certain industries, as well as in the home

and over the Internet. In addition, they rival video for clarity and visual beauty, allow

creative options not possible with video, can be customized according to audience

preferences and situational factors, and can enable social interaction, cooperation, and

competition. In the coming years, new formats and tools will be made available to

audiences and content creators, further accelerating the adoption of computer-generated

environments and ensuring their dominance over other Internet media formats.

What are computer-generated environments? “Virtual reality” is an alternate term

that many people know, but I am reluctant to use it here. It carries with it several

misleading connotations, and does not necessarily include some of the formats that I

believe will play an important role in the future. The concept dates to the 1950s and

1960s, when Ivan Sutherland envisioned computer technologies being used to “render

sensations that would seem real to their recipients.”27 Programmer Jaron Lanier coined

the term “virtual reality” in the 1980s, but admits that virtual reality is a “somewhat broad

idea” with no fixed definition.28 Thanks to a wave of media hype and imaginative

Hollywood fictionalizations in the early 1990s, many members of the public associate

virtual reality with special goggles and wired gloves that allow people to manipulate

virtual objects in a digitally rendered, three-dimensional space. However, this excludes

other 3D environments that do not require gloves and headgear. Edward Castronova

additionally noted that “virtual” and “reality” are themselves loaded terms when used to

describe simulated environments that are driving “real” experiences based on artificial

sensory input. In his book, he avoided this semantic minefield, instead using the term

“synthetic worlds” to describe persistent, interactive 3D spaces simultaneously accessed

by large numbers of people.29

I prefer “computer-generated environments.” It is more inclusive than “synthetic

world” or “virtual reality.” It encompasses any graphics technology that displays

computer-generated 3D representations of real, simulated, or imagined environments, and

allows users to control motion, perspective, and elements within these environments. It

also avoids the semantic issues that Castronova described. However, computer-generated

environments do not include certain types of computer-generated imagery (CGI) and 3D

effects, such as static 3D images (e.g., a 3D model of a car engine displayed as a still

27 Edward Castronova, Synthetic Worlds: The Business and Culture of Online

Games (Chicago: The University of Chicago Press, 2005), 287.

28 Janice J. Heiss, “The Future of Virtual Reality: Part Two of a Conversation with Jaron Lanier.” Articles and Tips, The Sun Developer Network. Available from http://java.sun.com/features/2003/02/lanier_qa2.html.

29 Castronova, 287-294.

image) or linear narratives made with 3D animation, such as Shrek, M&Ms television

advertisements, and some children’s television programs.

There are many examples of computer-generated environments in the workplace.

The U.S. military has been one of the most active users of such tools. Tank crews in the

early 1980s learned how to target their cannons using a simple 3D simulator based on the

popular Battlezone arcade game. Pilots have used flight simulators for decades, and the

Army offers a free, 3D video game called “America’s Army” over the Internet as an

interactive recruiting advertisement and simple training tool.30

In the business world, computer-generated environments are widely used in

architecture and industrial design, as well as in several science-related fields. Since the

1980s, the drafting program AutoCAD has been used to design buildings, vehicle parts,

and other products, with current versions supporting “3D walkthroughs” and various

methods of viewing objects from multiple perspectives.31 General Motors used AutoCAD

and several other construction-oriented 3D modeling applications to build a

2.4-million-square-foot plant in Lansing Delta Township, Michigan. The software tools helped GM

complete construction 5% to 8% under budget and 25% ahead of schedule, by letting

architects, builders, and plant managers plan the layout of the facility and all of the

equipment and infrastructure before the foundation was poured.32

30 Harold Kennedy, “Computer Games Liven Up Military Recruiting, Training.” National Defense, November 2002. Available from http://www.nationaldefensemagazine.org/issues/2002/Nov/Computer_Games.htm.

31 Shaan Hurley, Unofficial AutoCAD History Pages. Available from http://myfeedback.autodesk.com/history/area51.htm.

Outside of the workplace, video games are the oldest and most popular type of

computer-generated environment used by the public. A significant portion of the

population has grown up with them, and among younger people — the so-called “digital

natives” who have never known life without personal computers, broadband Internet

connections, and 3D games — game play is pervasive. As noted by John Palfrey, not all

young people can be considered digital natives, but people who are “born digital” are

more likely to interact with such technologies as digital natives.33

Battlezone, mentioned earlier, let players control a tank in a simple 3D

environment consisting of green polyhedrons, a distant mountain range, and a never-

ending assault of enemy tanks. Other 3D games from the early 1980s let players wander

through dungeons or castles, killing monsters and gathering treasure. The graphics of

these games, while not sophisticated, introduced millions of people to computer-

generated environments and the concept of doing things — manipulating objects,

completing missions, and sometimes cooperating with others — in simulated, three-

dimensional spaces. In the mid-1990s, players were exposed to more sophisticated

graphics and networked play, either over local-area networks or the Internet. Another

important development during this period was the rise of “modding,” which let players of

3D game titles such as Doom modify characters and missions to suit personal

preferences, or make gameplay more interesting. Game studios or talented

32 Robert Mitchell, “Field Report: GM builds on 3-D model.” Computerworld,

September 11, 2006. Available from http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=112739.

33 John Palfrey, “Born Digital.” John Palfrey from the Berkman Center at the Harvard Law School, October 28, 2007. Available from http://blogs.law.harvard.edu/palfrey/2007/10/28/born-digital/.

player/programmers developed the modding software, which could be downloaded from

official game websites or fan sites.

Now, an estimated 38% of U.S. adults and 81% of children ages four to 17

play video games,34 including 3D games based on sports (Madden NFL ’08),

futuristic combat (Halo 3), and even real life (The Sims). While these games are popular

as single-player pastimes or entertainment for small groups of people, an interesting new

game format has begun to attract large numbers of players. Massively multiplayer online

role-playing games (MMORPGs) allow thousands of geographically dispersed people to

simultaneously play in a persistent, shared, online world, usually built around a medieval

setting with lots of group campaigns and missions. These environments allow a high

degree of independence, creativity, and customization. In Battlezone, the player was a

standard tank, it was always night, and the starting level and location were always the

same. Now, a World of Warcraft player can choose his or her sex, race, class, continent,

gaming server, default language, guild, and numerous other variables — explained in

great detail in a 208-page guide.35 There are more than nine million active World of

Warcraft subscriptions.36

Second Life, a socially oriented virtual world accessed through the Internet, gives

“residents” even more extensive options to shape their own characters and in-world

34 Alexander Wolfe, “Who's The Child Now, Or Wii (Why) Most Adults Don't Play Video Games.” Wolfe’s Den, Information Week, December 2, 2007.

35 World of Warcraft Game Manual. Blizzard Entertainment, 2004.

36 “World Of Warcraft Surpasses 9 Million Subscribers Worldwide.” Press Release. Blizzard Entertainment, July 24, 2007. Available from http://www.blizzard.com/press/070724.shtml.


experiences. Using simple 3D building tools, they can model buildings, clothing,

vehicles, furniture, landscapes, plants, animals, and other objects. If it is nighttime in

their part of Second Life, and they cannot see, they can “force sun” to make it daylight —

even if others around them still see the same nighttime features. They can also customize

the appearance of their “avatars,” or personal 3D characters. Changing one’s face to have

a big nose, red eyes, and a mullet involves clicking through a few menus and adjusting

sliders that control nose size, eye color, and rear hair length. A resident can even change

his or her head to that of a cat, dog, or other animal.37

Both World of Warcraft and Second Life encourage socialization and cooperation

through shared missions or shared interests, whether it is conquering a monster-filled

cave system in World of Warcraft and splitting the treasures within, or building shops and

other virtual facilities for a Brazilian community in Second Life. In most game-oriented

virtual worlds, it is impossible to reach certain areas of the gaming world and achieve

high point levels without cooperating with other players and developing teams that most

effectively draw upon the various skills of different types of players. In Second Life,

ambitious building projects require groups of avatars, and enjoyment is often derived

from interaction with friends and strangers. As with the text-based Internet, interaction in

virtual worlds is not required, but it makes for richer and more rewarding social

experiences. In addition, the mechanics of socialization in these worlds parallel the tools

used in the text-based Internet, such as buddy lists and simple text messages. For

someone who has already been exposed to 3D games, instant messaging, and social

37 These avatars are referred to as “furries.”


networking, it is not difficult to make the leap to using an avatar, communicating in

group chats, and joining a guild in an MMORPG.

The 3D graphics for World of Warcraft look cartoonish, and Second Life’s

graphics look even more primitive — avatars move stiffly, textures look blurred, and

walls and other features often do not render at peak times or in locations where lots of

avatars congregate. These issues will gradually disappear as the technical infrastructure

of such services improves, and more advanced 3D hardware and software enters the

marketplace. Moore’s Law, an observation first put forth in 1965 by Gordon Moore, who

later co-founded Intel, holds in its commonly cited form that the number of transistors on

a chip will double every two years.38 It is usually invoked to predict increases in the power

of central processing units (CPUs), but can also be applied to advances in the graphics

processing units (GPUs) produced by specialized manufacturers such as nVidia. Every few years a new

generation of CPUs and GPUs is released to market, increasing the processing power of

desktop computers, gaming consoles, and portable devices. These advances allow game

designers to strive for the Holy Grail of the gaming industry — achieving advanced 3D

effects that approach photorealism:

The goal for many developers was now to create an experience identical to reality: rippling waters, flowing hair, shifting wind, dynamic moving lights, reflections on moving objects, facial lip syncing, varied character animation and emotions, and real physics and collisions.39
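The two-year doubling rule cited for Moore’s Law can be made concrete with a little arithmetic. The sketch below is only an illustration of that compounding: the function name, the starting count of one million transistors, and the ten-year horizon are hypothetical values chosen for the example, not figures from any manufacturer.

```python
def projected_transistor_count(initial_count, years_elapsed, doubling_period_years=2.0):
    """Project a transistor count assuming it doubles every `doubling_period_years`."""
    if doubling_period_years <= 0:
        raise ValueError("doubling_period_years must be positive")
    # Each elapsed doubling period multiplies the count by two.
    return initial_count * 2 ** (years_elapsed / doubling_period_years)


# Illustrative only: a hypothetical chip with 1 million transistors, ten years on.
print(projected_transistor_count(1_000_000, 10))  # 32000000.0
```

Five doublings in ten years yield a 32-fold increase, which suggests why rendering limits that look severe today may matter far less within a decade.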

38 Gordon E. Moore, “Cramming More Components Onto Integrated Circuits.” Electronics, Volume 38, Number 8, April 19, 1965.

39 John Hight, Jeannie Novak, Game Development Essentials: Game Project Management (Clifton Park, NY: Thomson Delmar Learning, 2008), 17.


The drive to photorealism in computer-generated environments potentially

involves sampling real-life objects. This is already done for 3D textures — instead of

painstakingly recreating the rough ochre color of a brick, a designer can take digital

photographs of the six sides of a brick and map them onto a 3D mesh in a software

application. There are also technologies for capturing real-life actions, such as human

movements, and applying them to models in 3D animation or computer-generated

environments. Microsoft is now developing software called Photosynth that pastes

geotagged photographs of a building or object onto a 3D model associated with the same

geospatial coordinates. An application called Fotowoosh turns 2D pictures into simulated

3D images. Such applications open up the possibility of computer-generated

environments or game worlds that mirror real-world places and people.40

Another gaming technology that should be considered in any discussion about the

development of computer-generated environments is the narrow application of artificial

intelligence used to drive the behavior of monsters, enemies, and non-player characters

(NPC) that populate video games. For years, game AI has been based on programmed

logic — e.g., if an avatar in World of Warcraft opens a certain dungeon door, a troll will

launch an attack. In recent years, developers have been experimenting with more

complex game AI that actually “learns” from environmental variables, or is trained by

observing the behavior of human players. Jeff Orkin, a game developer and researcher at

the MIT Media Lab, has developed an online 3D game called The Restaurant Game that

teaches a game AI how to interact with human players, by recording the interactions of

40 Ian Lamont, “Transforming 2D photos into 3D models.” The Digital Media Machine, Computerworld, April 24, 2007. Available from http://blogs.computerworld.com/node/5418.


thousands of real human volunteers playing the game online. Orkin says that this

technology can potentially be applied to virtual worlds, as a way to make the actions of

NPCs more realistic to human players or residents.41
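The contrast between programmed game logic and game AI trained on human play can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of how World of Warcraft or The Restaurant Game is actually built: the event names, the `SCRIPTED_RULES` table, and the `LearnedNPC` class are all hypothetical, and the "learning" shown is nothing more than counting which action recorded human players took most often in a given situation.

```python
from collections import Counter, defaultdict

# Hypothetical scripted rule table: a fixed trigger always yields the same reaction.
SCRIPTED_RULES = {"open_dungeon_door": "troll_attacks"}


def scripted_response(event):
    """Programmed logic: look up the event, and do nothing if no rule matches."""
    return SCRIPTED_RULES.get(event, "idle")


class LearnedNPC:
    """Toy NPC that imitates the most common recorded human response."""

    def __init__(self):
        # Maps an observed situation to a tally of what human players did next.
        self._observations = defaultdict(Counter)

    def observe(self, situation, human_action):
        """Record one logged interaction from a human play session."""
        self._observations[situation][human_action] += 1

    def respond(self, situation):
        """Return the action humans chose most often, or idle if none was seen."""
        tally = self._observations.get(situation)
        if not tally:
            return "idle"
        return tally.most_common(1)[0][0]


npc = LearnedNPC()
for action in ["greet", "greet", "seat_customer"]:
    npc.observe("customer_enters", action)

print(scripted_response("open_dungeon_door"))  # troll_attacks
print(npc.respond("customer_enters"))          # greet
```

The scripted table can only ever react in ways its designer anticipated, while the second approach changes its behavior as more play sessions are recorded.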

Besides gaming and virtual worlds, another popular application of computer-

generated environments involves simulations of buildings and representations of real-

world locations. It is now possible for potential homeowners to “tour” a 3D simulation of

a condominium development. Millions of vehicles in the United States have small

computers that capture location data from GPS satellites, and display a live, three-

dimensional representation of their locations and nearby streets. Google Earth, a software

program that uses geospatial information, satellite images, and 3D graphics, lets users

simulate flying over or through cities and geographical features. Google Earth users can

also geotag two-dimensional photographs, and map them on a corresponding Internet-

accessible 3D map. As of mid-2006, an estimated 72 million Americans had taken

“virtual tours” of another location online, with more than five million taking such tours

on a typical day.42

The computer-generated environments described above, and the functionality

available to users within them, are impossible to recreate with standard video

technologies. And why should they be? Computer-generated environments and video are

oriented toward different applications. However, this may soon change, as the digital

41 “The Restaurant Game: New forms of Artificial Intelligence for Immersive Education.” Jeff Orkin, MIT Media Lab. Presented at Immersive Education Day, Harvard Interactive Media Group/Harvard Graduate School of Education, December 8, 2007.

42 Xingpu Yuan and Mary Madden, “Virtual Space is the Place.” Pew Internet & American Life Project, November 27, 2006.


natives begin to enter maturity, 3D graphics achieve photorealism, and new Internet-

based software tools open up an expanded universe of online experiences that overlap

with those currently provided by video. Audiences and content creators will discover that

computer-generated environments can not only duplicate many types of video

programming, but also can provide customization, interactivity, and even social options

that amplify the ability of moving images to entertain and inform.

In recent years, there have been a number of experiments that indicate the

direction in which computer-generated environments are heading, and how they will

compete with and eventually displace video. Machinima — short for “machine

animation” or “machine cinema” — is one example. It involves the use of 3D animation

tools to make dramas, music videos, and other entertainment-oriented content.

Professional CGI and 3D animation tools have been used by Hollywood studios for

decades, but machinima is largely a grassroots phenomenon that relies upon inexpensive

technologies to create and distribute content. The content creators are individuals or small

teams, the tools are free or cheap game modding engines or games, and the distribution

platform is usually the Internet.

One example is Red vs. Blue. Starting in 2003, and ending 100 episodes later in

2007, a small team of writers and programmers used the Halo game engine to create a

comedy series depicting the hapless antics of two opposing squads of soldiers.43 The

humor was juvenile, the voice actors were amateurs, and the 3D graphics were simple,

but the series became a cult hit on the Internet and was eventually distributed via DVD.

43 “Red vs. Blue: A Machinima Series Based on Halo.” Available from http://rvb.roosterteeth.com/home.php.


Another machinima, The French Democracy, was created in 2005 by Alex Chan, a

French industrial designer who had no previous experience with video production. He

wanted to explain the causes of the urban riots that tore through France that autumn, and

he used The Movies — a $70 PC game — to create a drama that described the conditions

and factors that he believed were responsible for the riots. The quality of the animation

was primitive, and Chan had to rely on subtitles and music instead of voice actors for

audio, but the message was powerful. The 13-minute long clip was downloaded by tens

of thousands of viewers44 and generated a great deal of mainstream press reaction.

Machinima has barely made an impact on public awareness, but that will

eventually change as the quality of the graphics in machinima productions approaches

photorealism, high-quality speech synthesizers are developed, software tools

improve, and amateur writers/content creators become more skilled at scripting and

programming.

Further, while current machinima are like video in that they tell linear narratives,

the descendants of this technology will allow customization and interaction. For instance,

a machinima might let viewers preselect the appearance of the avatar stars, the sounds of

their voices, the location of the dramas, and other plot elements. So, I may opt to watch a

soap opera machinima in the default mode — a standard plot involving a love triangle

between two men and a woman in Los Angeles. However, another viewer may want to

see a love triangle with two women and a man in a small town in the Rockies, change the

name of the lead male character to “Walter,” set the appearance of both of the women to

44 The French Democracy had 31,102 views in QuickTime format, and 14,451 views in Windows Movie format as of January 8, 2007, on the Machinima.com website.


blondes, and restrict close-up shots to less than 3% of the total plot length. A third viewer

in Japan may transfer the story to Tokyo, and have all of the characters speaking in

Japanese. Such options will be possible with the more advanced development tools and

user interfaces.

Another possibility for future machinima is to let viewers bring their own avatars

into the story. Most 3D games already have a background story and a plot that players are

supposed to follow. With the exception of some MMORPGs, it is seldom possible to

deviate from the pre-arranged mission, let alone play a part in a love triangle. Flexible

game AIs and plot templates could make interactive machinima a reality. Conceivably,

groups of friends could join each other in a drama, helping to support the plot in some

way — for instance, distracting or disabling a character who seeks to harm the

protagonist. Or, a historical drama or documentary depicting the start of the American

Revolution could also serve as a virtual classroom that lets elementary school students explore the

Boston of the 1770s. The application could also let students interact with the period

avatars, whose reactions would be partially driven by advanced game AIs.

In terms of news and documentaries, computer-generated environments cannot

replace compelling video footage of live events, natural phenomena, and recordings of

personal moments. However, computer-generated environments may be used for realistic

simulations when video is not available. For instance, they may let viewers see a two-car

accident from multiple angles — including the points of view from each of the drivers’

seats starting five seconds before the collision — based on the geospatial data gathered

from the police report and other sources. Or, they could let students in an astronomy class

see a simulated asteroid impact on the moon in visible light or infrared light, from 50,


100, or 1,000 kilometers away. Author and inventor Ray Kurzweil predicts computer-

generated environments may one day be overlaid upon our real-world views, through

eyewear that displays text, icons and other information corresponding to objects in our

field of view:

… If you look at someone, little pop-ups will appear in your field of view, reminding you of who that is, giving you information about them, reminding you that it’s their birthday next Tuesday. If you look at buildings, it will give you information, it will help you walk around. If it hears you stumbling over some information that you can’t quite think of, it will just pop up without you having to ask.

The “augmented reality” described by Kurzweil would reduce our dependence

upon information delivered through computer monitors and small liquid crystal displays

on mobile phones. It would also rely upon geotags, facial recognition software, speech

recognition technologies, and a brain-machine interface that lets people input information

or commands into these systems without speaking or pressing keys.

Computer-generated environments could also replace live human anchors and

newsrooms. Most anchors simply read from scripts that either describe the footage that is

being shown on the screen or introduce segments by reporters in the field. This method of

presenting news is expensive and inflexible. Anchors are expensive. They can only work

at pre-arranged times throughout the day. They can get sick. Some viewers do not like the

appearance of a certain anchor. Avatars and software can remedy these shortcomings, and

allow a viewer to customize the appearance of his or her anchor, the type of news the

anchor narrates, and the time the newscast starts and finishes. Developers at

Northwestern University’s Intelligent Information Laboratory have created a prototype


application called News At Seven that features an avatar anchor reading news from eight

different categories — Business, Entertainment, Health, Politics, Science, Tech, U.S., and

World news. News At Seven is delivered over the Web, and is automated. Scripts, still

images, and video are pulled from other online news sources.45 It can be launched at any

time of the day or night. Until October of 2007, News At Seven used avatars from the

Half-Life 2 game engine, but the high processing requirements associated with generating

a talking, 3D avatar on demand forced the designers to switch to simple, 2D avatars for

the limited beta launch of the application.46

In the future, similar news applications could allow 3D avatars to be customized

to mimic real news anchors (Walter Cronkite, Katie Couric, Jack Williams), other real

people (someone’s father, a favorite teacher, a politician), characters based on a set of

self-selected attributes, or one’s own avatar. The avatars might be seated in a simulated

newsroom, or could be moved to a computer-generated environment that mirrors the real-

life location where the news that they are describing took place. The environment

might be based upon geotags and other metadata that were generated by the original

reports and video footage. The news itself can also be fine-tuned, based on specific

categories, locations, times, and keywords chosen by the viewer. I may choose to have

the first half of my newscast consist of developments relating to the New York Stock

Exchange in the previous 24 hours. For the second half, I may restrict my anchor to

45 Kate Goodloe, “Broadcast News Goes Human-Free.” The Wall Street Journal, January 6, 2007. Available from http://online.wsj.com/public/article/SB116803755568668612-7IG7wBl1Wpezld0friGmB0x1ONM_20070113.html.

46 “News at Seven Beta Launch!” News At Seven Blog, October 29, 2007. Available from http://newsatseven.com/blog/?m=200710.


reading reports that mention “China” or “Beijing” in the lede and have accompanying

video footage sourced from any clip taken in Beijing or Shanghai within the past six

hours. Detailed metadata would be crucial to creating such a report.
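To show why detailed metadata matters for such a personalized newscast, the request above can be expressed as a simple filter. The following is a rough sketch under stated assumptions: the `Report` record, its field names, and the `matches_request` function are hypothetical, and a real news service would need far richer metadata and more careful text matching than this.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Report:
    """Hypothetical news item carrying the metadata the filter relies on."""
    lede: str
    video_location: Optional[str] = None   # where the accompanying clip was shot
    video_time: Optional[datetime] = None  # when the accompanying clip was shot


def matches_request(report, now,
                    keywords=("China", "Beijing"),
                    locations=("Beijing", "Shanghai"),
                    max_video_age=timedelta(hours=6)):
    """Return True if the lede mentions a keyword and the clip is local and recent."""
    lede_hit = any(word.lower() in report.lede.lower() for word in keywords)
    if not lede_hit or report.video_location is None or report.video_time is None:
        return False
    video_age = now - report.video_time
    is_recent = timedelta(0) <= video_age <= max_video_age
    return report.video_location in locations and is_recent


now = datetime(2008, 1, 16, 12, 0)
reports = [
    Report("Beijing announces new transit plan", "Shanghai", now - timedelta(hours=2)),
    Report("China trade figures released", "Beijing", now - timedelta(hours=8)),
    Report("Local election results", "Boston", now - timedelta(hours=1)),
]
selected = [r.lede for r in reports if matches_request(r, now)]
print(selected)  # ['Beijing announces new transit plan']
```

Each condition in the viewer's request maps to one metadata field, so a report lacking a location or timestamp for its footage simply cannot be selected.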

Approximately 10 years ago, Stephens prophesied a mass media environment that

would be increasingly dominated by video. Noting the failure of CD-ROMs and other

early interactive video technologies such as Time Warner’s Qube,47 he foresaw computers

playing largely supportive roles, such as adding graphical flavor and creating distribution

channels for new video. Even in the current Web 2.0 age, characterized by text-based

media such as social networking websites and blogs, many observers still believe video

will eventually triumph, thanks to its solid broadcasting track record, strong advertising

revenue, and the popularity of online video. Other experts acknowledge that computer-

generated environments will be important, but many are unsure what such formats will

look like or how people will use them.48

While predicting the future is difficult, it is possible to identify trends based on

quantitative research and an understanding of recent developments in computer software,

hardware, and networking technologies. I believe many of the predictions outlined above,

far from being the realm of science fiction, provide valid insights into the future of mass

media. Computer-generated environments and other Internet technologies will not only

change the ways in which we interact with each other, they will change the way in which

we see our world.

47 Stephens, 169, 174.

48 Janna Quitney Anderson, Lee Rainie, “The Future of the Internet II.” Pew Internet & American Life Project, September 24, 2006.
