Experience a Glimpse of 3D Web Browsing

Coming Soon: 3D computing. Well, it makes sense, doesn’t it?

3D Windows XP Icons
image credit: http://goo.gl/98PXI

My claim is that 3D is the next step in object-oriented user interface (OOUI), which is the way most of us interact with computers after someone (at Apple, I think) had the idea that we’d store ‘documents’ in ‘folders’ rather than access them via a command line. Ever since, we’ve been using ‘object-oriented’ analogies to interact with our machines.

Now is the age of 3D screen technologies, with Hollywood fighting back against piracy with a new golden age for cinema, Samsung outperforming Sony to become the number one manufacturer of 3D TVs, and the Nintendo 3DS making use of prismatic 3D in its menus and, of course, in-game (think I might be buying Ocarina again soon). Not to mention Microsoft’s Kinect, which changes the way we interact in the three dimensions of physical, as opposed to virtual, space.

But before all of this, there were innovators trying to make 3D fit for everyday use, such as TATMobile who, without the power to print prismatic screens, force a behaviour change through the use of 3D glasses, or sell expensive stereoscopic 3D projectors, had come up with a pretty cool lo-fi solution:

The video above demonstrates the use of the front-facing camera on your mobile phone to track the location of your eyes, augmenting what’s onscreen and allowing you to see ‘behind’ icons or onto different screens by peering around. Hopefully you can imagine how a 3D screen might alter the way you interact with your device, so it’s no wonder they were bought by RIM and are now developing UI for BlackBerry.
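The peering-around effect can be sketched with a little geometry. This is only an illustration of the general idea, not TATMobile’s actual method: the function name `layer_shift`, its parameters and the millimetre units are all assumptions made for the example, and the face tracking that would supply the eye position is not shown. If the screen is treated as a window, a layer sitting some depth behind the glass appears to slide in the same direction as the viewer’s eye, and deeper layers slide further.

```python
def layer_shift(eye_x, eye_y, eye_distance, layer_depth):
    """Return the (dx, dy) on-screen shift for a virtual layer.

    All inputs are hypothetical values chosen for this sketch:
      eye_x, eye_y : eye position relative to the screen centre (mm),
                     assumed to come from a face tracker (not shown)
      eye_distance : distance from the eye to the screen surface (mm)
      layer_depth  : how far 'behind' the glass the layer sits (mm)
    """
    if eye_distance <= 0:
        raise ValueError("eye_distance must be positive")
    if layer_depth < 0:
        raise ValueError("layer_depth must not be negative")
    # Similar triangles: draw a line from the eye to a point on the layer
    # and see where it crosses the screen plane. A layer on the glass
    # (depth 0) never moves; a very deep layer follows the eye almost 1:1.
    scale = layer_depth / (eye_distance + layer_depth)
    return (eye_x * scale, eye_y * scale)


if __name__ == "__main__":
    # Eye moves 60 mm to the right while held 300 mm from the screen.
    for depth in (0.0, 50.0, 300.0):
        print(depth, layer_shift(60.0, 0.0, 300.0, depth))
```

Because each layer moves by a different amount, icons in front slide across those behind, which is what lets you look ‘behind’ them.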

While we’re at it, also check out the work of Bumptop (sadly now defunct), Johnny Lee’s Wii hacks, and even YouTube’s foray into 3D video.

Another lo-fi solution to making 3D useful comes from Mozilla, outlined in this fascinating article. Their technology, called Tilt, is not a way to physically see in 3D (it’s just software at this point), but it certainly nods towards a future of stereoscopic 3D web content. You can test the beta of Mozilla’s Tilt plugin for Firefox at that link, but here’s a demo:

All we need now is for computer, laptop, tablet & mobile screens to become 3D-enabled, and for vast swathes of web designers to optimise their sites for WebGL, and suddenly those social buttons become a bit more clickable.

Applying McLuhan

I begin with McLuhan, whose Laws of Media or Tetrad offers greater insights for Mobile AR, sustaining and developing upon the arguments developed in my assessment of the interlinking technologies that meet in Mobile AR, whilst also providing the basis to address some of this man’s deeper thoughts.

The tetrad can be considered an observation lens to turn upon one’s subject technology. It assumes four processes take place during each iteration of a given medium. These processes are revealed as answers to the following questions, taken from Levinson (1999):

“What aspect of society or human life does it enhance or amplify? What aspect, in favour or high prominence before the arrival of the medium in question, does it eclipse or obsolesce? What does the medium retrieve or pull back into centre stage from the shadows of obsolescence? And what does the medium reverse or flip into when it has run its course or been developed to its fullest potential?”

(Digital McLuhan 1999: 189).

To ask each of these it is useful to transfigure our concept of Mobile AR into a more workable and fluid term: the Magic Lens, a common expression in mixed reality research. Making this change allows the exploration of the more theoretical aspects of the technology free of its machinic nature, whilst integrating a necessary element of metaphor that will serve to illustrate my points.

To begin, what does the Magic Lens amplify? AR requires the recognition of a pre-programmed real-world image in order to augment the environment correctly. It is important to mention that it is the user who locates this target. It could be said that the Magic Lens magnifies more than it amplifies an aspect of the user’s environment because, like other optical tools, the user must point the device towards the target and look through it. The difference with this Magic Lens is that one aspect of its target, one potential meaning, is privileged over all others. An arbitrary black and white marker holds the potential to mean many things to many people, but viewed through an amplifying Magic Lens it means only what the program recognises and consequently superimposes.

This superimposition necessarily obscures what lies beneath. McLuhan might recognise this as an example of obsolescence. The Magic Lens privileges virtual over real imagery, and the act of augmentation leaves physical space somewhat redundant: augmenting one’s space makes it more virtual than real. The AR target undergoes amplification, becoming the necessary foundation of the augmented reality. What is obsolesced by the Magic Lens, then, is not the target which it obscures, but everything except the target.

I am reminded of McLuhan’s Extensions of Man (1962: 13), which offers the view that in extending ourselves through our tools, we auto-amputate the aspect we seek to extend. There is a striking parallel to be drawn with amplification and obsolescence, which becomes clear when we consider that in amplifying an aspect of physical reality through a tool, we are extending sight, sound and voice through the Magic Lens to communicate in wholly new ways using The Virtual as a conduit. This act obsolesces physical reality, the nullification effectively auto-amputating the user from their footing in The Real. So where have they ‘travelled’? The Magic Lens is a window into another reality, a mixed reality where real and virtual share space. In this age of Mixed Realities, the tetrad can reveal more than previously intended: new dimensions of human interaction.

The third question in the tetrad asks what the Magic Lens retrieves that was once lost. So much new ground is gained by this technology that it would be difficult to make a claim. However, I would not hold belief in Mobile AR’s success if I didn’t recognise the exhumed, as well as the novel, benefits that it offers. The Magic Lens retrieves the everyday tactility and physicality of information engagement, that which was obsolesced by other screen media such as television, the Desktop PC and the games console. The Magic Lens encourages users to interact in physicality, not virtuality. The act of actually walking somewhere to find something out, or going to see someone to play with them, is retrieved. Moreover, we retrieve the sense of control over our media input that was lost by these same technologies. Information is freed into the physical world, transfiguring its meaning and offering a greater degree of manipulative power. Mixed Reality can be seen only through the one-way-glass of the Magic Lens; The Virtual cannot spill through unless we allow it to. We have seen that certain mainstream media can wholly fold themselves into reality and become an annoyance (think Internet pop-ups and mobile ringtones); through the Magic Lens we retrieve personal agency to navigate our own experience. I earlier noted that “the closer we can bring artefacts from The Virtual to The Real, the more applicable these can be in our everyday lives”; a position that resonates with my growing argument that engaging with digital information through the Magic Lens is an appropriate way to integrate and indeed exploit The Virtual as a platform for the provision of communication, leisure and information applications.

It is hard to predict what the Magic Lens might flip into, since at this point AR is a wave that has not yet crested. I might suggest that since the medium is constrained to success in its mobile device form, its trajectory is likely entwined with that medium. So, the Magic Lens flips into whatever the mobile multimedia computer flips into. Another possibility is that the Magic Lens inspires such commercial success and industrial investment that a surge in demand for Wearable Computers shifts AR into a new form. This time, the user cannot dip in and out of Mixed Reality as they see fit: they are immersed in it whenever they wear their visor. This has connotations all of its own, but I will not expound my own views given that much cultural change must first occur to implement such a drastic shift in consumer fashions and demands. A third way for the Magic Lens to ‘flip’ might be its wider application in other media. Developments in digital ink technologies; printable folding screens; ‘cloud’ computing; interactive projector displays; multi-input touch screen devices; automotive glassware and electronic product packaging could all take advantage of the AR treatment. We could end up living far more closely with The Virtual than previously possible.

In their work The Global Village, McLuhan and Powers (1989) state that:

“The tetrad performs the function of myth in that it compresses past, present, and future into one through the power of simultaneity. The tetrad illuminates the borderline between acoustic and visual space as an arena of the spiralling repetition and replay, both of input and feedback, interlace and interface in the area of imploded circle of rebirth and metamorphosis”

(The Global Village 1989: 9)

I would be interested to hear their view on the unique “simultaneity” offered by the Magic Lens, or indeed the “metamorphosis” it would inspire, but I would argue that when applied from a Mixed Reality inter-media perspective, their outlook seems constrained to the stringent and self-involved rules of their own epistemology. Though he would be loath to admit it, Baudrillard took on McLuhan’s work as the basis of his own (Genosko, 1999; Kellner, date unknown), and made it relevant to the postmodern era. His work is cited by many academics seeking to forge a relationship to Virtual Reality in their research…

The Internet

The Internet, or specifically the World Wide Web, requires a limited virtuality in order to do its job. The shallow immersion offered to us by our computer screens actually serves our needs very well, since the Internet’s role in our lives is to connect, store and present information in accessible, searchable, scannable, and consistent form for millions of users to access simultaneously, whether we dive in and out quickly or surround ourselves in the information we want. The naturally-immersive VR takes us partway towards Mobile AR, but its influence stops at the (admittedly profound) concept of real-time interaction with 3D digital images. What the Internet does is bring information to us, but VR forces us to go to it.

This is a function of the Mixed Reality Scale, and the distance of each from The Real. The closer we can bring artefacts from The Virtual to The Real, the more applicable these can be in our everyday lives. The self-sufficient realm of The Virtual does not require grounding in physical reality in order to exist, whereas the Internet and other MR media depend on The Real to operate. AR is the furthest that a virtual object can be ‘stitched into’ our reality, and in doing so we exploit our power in this realm to manipulate and interact with these digital elements to suit our own ends, as we currently do with the World Wide Web.

The wide-ranging entertainment resources offered by the Internet are having a profound effect on real-world businesses, a state of flux that Mobile AR could potentially exploit. There is a shift in the needs of consumers of late that is forcing a change in the ways that many blue-chip organisations are handling their businesses: Mobile data carriers (operators), portals, publishers, content owners and broadcasters are all seeking new content types to face up to the threat of VOIP (Voice Over Internet Protocol) – which is reducing voice traffic; and Web TV/ Internet – reducing (reduced?) TV audiences, particularly in the youth market.

T-Mobile, for example, seeks to improve on revenues through offering unique licensed mobile games, themes, ringtones and video-clips on their T-Zones Mobile Internet Portal; NBC’s hit-series ‘Heroes’ is the most downloaded show on the Internet, forcing NBC to offer exclusive online comics on their webpage, seeking to recoup advertising revenue losses through lacing the pages of these comics with advertising. Mobile AR represents a fresh landscape for these businesses to mine. It is no surprise, then, that some forward-thinking AR developers are already writing software specifically for the display of virtual advertisement billboards in built-up city areas (T-Immersion).

The Internet has changed the way we receive information about the world around us. This hyper-medium has swallowed the world’s information and media content, whilst continuing to enable the development of new and exciting offerings exclusive to the desktop user. The computing capacity required to use the Internet has in the past constrained the medium to the desktop computer, but in the ‘Information Age’ the World Wide Web is just that: World Wide.

Virtual Reality

AR is considered by some to be a logical progression of VR technologies (Liarokapis, 2006; Botella, 2005; Reitmayr & Schmalstieg, 2001), a more appropriate way to interact with information in real-time that has been granted only by recent innovations. Thus, one could consider that a full historical appraisal would pertain to VR’s own history, plus the last few years of AR developments. Though this method would certainly work for much of Wearable AR, which uses a similar device array, the same could not be said for Mobile AR, since by its nature it offers a set of properties from a wholly different paradigm: portability, connectivity and many years of mobile development exclusive of AR research come together in enhancing Mobile AR’s formal capabilities. Despite the obvious mass-market potential of this technology, most AR research continues to explore the Wearable AR paradigm. Where Mobile AR is cousin to VR, Wearable AR is sister. Most published works favour the Wearable AR approach, so if my assessment of Mobile AR is to be fair I cannot ignore its grounding in VR research.

As mentioned earlier, VR is the realm at the far right of my Mixed Reality Scale. To explore a Virtual Reality, users must wear a screen array on their heads that cloaks their vision with a wholly virtual world. These head-mounted displays (HMDs) serve to transpose the user into this virtual space whilst cutting them off from their physical environment:

A Virtual Reality HMD, two LCD screens occupy the wearer's field of vision

The HMDs must be connected to a wearable computer, a Ghostbusters-style device attached to the wearer’s back or waist that holds a CPU and graphics renderer. To interact with virtual objects, users must hold a joypad. Aside from being a lot to carry, this equipment is restrictive on the senses and is often expensive:

A Wearable Computer array, this particular array uses a CPU, GPS, HMD, graphics renderer, and human-interface-device

It is useful at this point to reference some thinkers in VR research, with the view to better understanding The Virtual realm and its implications for Mobile AR’s Mixed Reality approach. Writing on the different selves offered by various media, Lonsway (2002) states that:

“With the special case of the immersive VR experience, the user is (in actual fact) located in physical space within the apparatus of the technology. The computer-mediated environment suggests (in effect) a trans-location outside of this domain, but only through the construction of a subject centred on the self (I), controlling an abstract position in a graphic database of spatial coordinates. The individual, of which this newly positioned subject is but one component, is participant in a virtuality: a spatio-temporal moment of immersion, virtualised travel, physical fixity, and perhaps, depending on the technologies employed, electro-magnetic frequency exposure, lag-induced nausea, etc.”

Lonsway (2002: 65)

Despite the technology’s flaws, media representations of VR throughout the eighties and early nineties such as Tron (Lisberger, 1982), Lawnmower Man (Leonard, 1992) and Johnny Mnemonic (Longo, 1995) generated plenty of audience interest and consequent industrial investment. VR hardware was produced in bulk for much of the early nineties, but it failed to become a mainstream technology largely due to a lack of capital investment in VR content, a function of the stagnant demand for expensive VR hardware (Mike Dicks of Bomb Productions: personal communication). The market for VR content collapsed, but the field remains an active contributor in certain key areas, with notable success as a commonplace training aid for military pilots (Baumann, date unknown) and as an academic tool for the study of player immersion and virtual identity (Lonsway, 2002).

Most AR development uses the same array of devices as VR: a wearable computer, an input device and an HMD. The HMD is slightly different in these cases; it is transparent and contains an internal half-silvered mirror, which combines images from an LCD display with the user’s vision of the world:

An AR HMD, this model has a half-mirrored screen at 45 degrees. Above are two LCDs that reflect into the wearer's eyes whilst they can see what lies in front of them

What Wearable AR looks like, notice the very bright figure ahead. If he was darker he would not be visible

There are still many limitations placed on the experience, however: first, the digital graphics must be very bright in order to stand out against natural light; second, the system requires a cumbersome wearable computer array; third, this array is at a price-point too high for it to reach mainstream use. Much of the hardware used in Wearable AR research is bought wholesale from liquidated VR companies (Dave Mee of Gameware: personal communication), a fact representative of the backward thinking of much AR research.

In their work New Media and the Permanent Crisis of Aura, Bolter et al. (2006) apply Benjamin’s work on the Aura to Mixed Reality technologies, and attempt to forge a link between VR and the Internet. This passage offers a perspective on the virtuality of the desktop computer and the World Wide Web:

“What we might call the paradigm of mixed reality is now competing successfully with what we might call ‘pure virtuality’ – the earlier paradigm that dominated interface design for decades.
In purely virtual applications, the computer defines the entire informational or perceptual environment for the user … The goal of VR is to immerse the user in a world of computer generated images and (often) computer-controlled sound. Although practical applications for VR are relatively limited, this technology still represents the next (and final?) logical step in the quest for pure virtuality. If VR were perfected and could replace the desktop GUI as the interface to an expanded World Wide Web, the result would be cyberspace.”

Bolter et al. (2006: 22)

This account offers a new platform for discussion useful for the analysis of the Internet as a component in Mobile AR: the idea that the Internet could exploit the spatial capabilities of a Virtual Reality to enhance its message. Bolter posits that this could be the logical end of a supposed “quest for pure virtuality”. I would argue that the reason VR did not succeed is the same reason that there is no “quest” to join: VR technologies lack the real-world applicability that we can easily find in reality-grounded media such as the Internet or mobile telephone.

What is AR and What is it Capable Of?

Presently, most AR research is concerned with live video imagery and its processing, which allows the addition of live-rendered 3D digital images. This new augmented reality is viewable through a suitably equipped device, which incorporates a camera, a screen and a CPU capable of running specially developed software. This software is written by specialist software programmers, with knowledge of optics, 3D-image rendering, screen design and human interfaces. The work is time consuming and difficult, but since there is little competition in this field, the rare breakthroughs that do occur are the result of capital investment: something not willingly given to developers of such a nascent technology.

What is exciting about AR research is that once the work is done, its potential is immediately seen, since in essence it is a very simple concept. All that is required from the user is their AR device and a real world target. The target is an object in the real world environment that the software is trained to identify. Typically, these are specially designed black and white cards known as markers:

An AR marker, this one relates to a 3D model of Doctor Who's Tardis in Gameware's HARVEE kit

These assist the recognition software in judging viewing altitude, distance and angle. Upon identification of a marker, the software will project or superimpose a virtual object or graphical overlay above the target, which becomes viewable on the screen of the AR device. As the device moves, the digital object orients in relation to the target in real-time:

Augmented Reality in action, multiple markers in use on the HARVEE system on a Nokia N73
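To make the idea of judging distance and angle from a marker more concrete, here is a minimal sketch. It is not how HARVEE or any particular AR toolkit works: it assumes a simple pinhole camera, a square marker of known size facing the camera roughly head-on, and a corner detector (not shown) that returns the four corners in a fixed order. The function name `marker_distance_and_roll` and all parameter names are made up for this example.

```python
import math


def marker_distance_and_roll(corners, marker_size_mm, focal_length_px):
    """Estimate camera-to-marker distance (mm) and in-plane rotation (degrees).

    corners         : four (x, y) pixel positions, ordered top-left,
                      top-right, bottom-right, bottom-left (assumed to
                      come from a marker detector that is not shown)
    marker_size_mm  : real side length of the printed square marker
    focal_length_px : camera focal length expressed in pixels

    Assumes a pinhole camera and a marker facing the camera head-on;
    a tilted marker would need a full pose estimate instead.
    """
    tl, tr, br, bl = corners
    # Average the four side lengths to smooth out detection noise.
    sides = [
        math.dist(tl, tr),
        math.dist(tr, br),
        math.dist(br, bl),
        math.dist(bl, tl),
    ]
    apparent_size_px = sum(sides) / 4.0
    if apparent_size_px == 0:
        raise ValueError("marker corners are degenerate")
    # Pinhole model: apparent size shrinks in proportion to distance.
    distance_mm = focal_length_px * marker_size_mm / apparent_size_px
    # Roll: the angle of the marker's top edge in the image.
    roll_deg = math.degrees(math.atan2(tr[1] - tl[1], tr[0] - tl[0]))
    return distance_mm, roll_deg


if __name__ == "__main__":
    square = [(100, 100), (200, 100), (200, 200), (100, 200)]
    print(marker_distance_and_roll(square, 80.0, 500.0))
```

Once the distance and rotation are known, the software can scale and turn the virtual object to match before drawing it over the marker, repeating the estimate on every video frame as the device moves.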

The goal of some AR research is to free devices from markers, to teach AR devices to make judgements about spatial movements without fixed reference points. This is the cutting edge of AR research: markerless tracking. Most contemporary research, however, uses either marker-based or GPS information to process an environment.

Marker-based tracking is suited to local AR on a small scale, such as the Invisible Train Project (Wagner et al., 2005) in which players collaboratively keep virtual trains from colliding on a real world toy train track, making changes using their touch-screen handheld computers:

The Invisible Train Project (Wagner et al., 2005)

GPS tracking is best applied to large scale AR projects, such as ARQuake (Thomas et al., 2000), which exploits a scale virtual model of the University of Adelaide and a modified Quake engine to place on-campus players into a ‘first-person-shooter’. This application uses a headset, wearable computer, and a digital compass, which together create the effect that enemies appear to walk the corridors and ‘hide’ around corners. Players shoot with a motion-sensing arcade gun, but the overall effect is quite crude:

ARQuake (Thomas et al., 2000)

More data input would make the game run more smoothly and would provide a more immersive player experience. The best applications of AR will exploit multiple data inputs, so that large-scale applications might have the precision of marker-based applications whilst remaining location-aware.
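One simple way to picture combining inputs is inverse-variance weighting, a standard method for merging two independent estimates of the same quantity. The sketch below is my own illustration, not something taken from ARQuake or any other project mentioned here, and the function name `fuse_estimates`, the one-dimensional position and the example variances are all assumptions. A real system would more likely run a proper filter (a Kalman filter, say) over full 3D poses.

```python
def fuse_estimates(gps, marker=None):
    """Combine a coarse GPS fix with a precise marker-based fix.

    gps, marker : (position_m, variance_m2) tuples for a single axis.
                  marker is None when no marker is currently in view.
    Returns a fused (position_m, variance_m2) tuple.
    """
    if marker is None:
        # No marker visible: stay location-aware using GPS alone.
        return gps
    gps_pos, gps_var = gps
    marker_pos, marker_var = marker
    if gps_var <= 0 or marker_var <= 0:
        raise ValueError("variances must be positive")
    # Each source is weighted by how much we trust it (1 / variance),
    # so the precise marker reading dominates the noisy GPS reading.
    gps_weight = 1.0 / gps_var
    marker_weight = 1.0 / marker_var
    total_weight = gps_weight + marker_weight
    fused_pos = (gps_weight * gps_pos + marker_weight * marker_pos) / total_weight
    fused_var = 1.0 / total_weight
    return fused_pos, fused_var


if __name__ == "__main__":
    # Hypothetical readings: GPS says 10 m (noisy), marker says 12 m (precise).
    print(fuse_estimates((10.0, 4.0), (12.0, 1.0)))
    print(fuse_estimates((10.0, 4.0)))
```

The fused variance is always smaller than either input’s, which is the numerical version of the point above: more inputs give a steadier picture.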

Readers of this blog will be aware that AR’s flexibility as a platform lends applicability to a huge range of fields:

  • Current academic work uses AR to treat psychological conditions: AR-enabled projections have successfully cured cockroach phobia in some patients (Botella et al., 2005);
  • There is a wide range of civic and architectural uses: Roberts et al. (2002) have developed AR software that enables engineers to observe the locations of underground pipes and wires in situ, without the need for schematics;
  • AR offers a potentially rich resource to the tourism industry: the Virtuoso project (Wagner et al., 2005) is a handheld computer program that guides visitors around an AR enabled gallery, providing additional aural and visual information suited to each artefact.

The first commercial work in the AR space was far more playful, however. AR development in media presentations for television has led to such primetime projects as Time Commanders (Lion TV for BBC2, 2003-2005), in which contestants oversee an AR-enabled battlefield and strategise to defeat the opposing army, and FightBox (Bomb Productions for BBC2, 2003), in which players build avatars to compete in an AR ‘beat-em-up’ that is filmed in front of a live audience. T-Immersion (2003- ) produce interactive visual installations for theme parks and trade expositions. Other work is much simpler: in one case the BBC commissioned an AR remote-control virtual Dalek meant for mobile phones, due for free download from BBC Online:

A Dalek, screenshot taken from HARVEE's development platform (work in progress)

The next entry in this series is a case study in AR development.