Wednesday, October 30, 2019

The Raven Has Landed: What do Edgar Allan Poe and Glenn Close know about Linked Open Data?

The raven has landed! And just in time for Halloween.


                 "The Raven" by Ivan Misic is licensed under CC BY-NC-ND 4.0 via Creative Commons

Let me explain.

In our course at City, University of London, we are exploring how to create a shared language on the web that computers can understand and speak. The idea is called Linked Open Data (LOD), and it is a component of what advocates call "The Semantic Web" (Floridi, for one, takes umbrage at this term and instead refers to it as the "MetaSyntactic Web") (Floridi, 2009). One of the biggest advocates for Linked Open Data is none other than Tim Berners-Lee, the English inventor who brought us the World Wide Web in the early 90s and who currently lives in Boston, Massachusetts. In a much-heralded 2009 TED talk, Berners-Lee explained some of the concepts behind LOD. At one point, he even urged his audience to shout "Raw Data Now!"


                                                                TED Talk Video, The Next Web, TED2009

It was the battle cry of nerds everywhere.

To his credit, though, LOD has the potential to nudge the economy and the scientific, aesthetic, and humanitarian spheres into overdrive. Instead of data exchange relying on human-made connections on the web via HTML (think of all those hyperlinks you see in webpages and documents), the new method would use computer-generated, automatic links that do the heavy lifting for us.

As it stands now, much of the web's data is still couched in proprietary terms and trapped within the information silos of the earliest Web 2.0 pioneers like Facebook, Apple, LinkedIn, Instagram and Twitter. These behemoths, unsurprisingly, are slow to get on board and share their data (Amazon is one notable exception and has been opening up its data through Amazon Web Services). With the other big players, we are left with the dusty tools of HTML and HTTP, relying on human beings rather than computers to make the connections.

Glenn Close and Easter?


                                                                     photo by Alan Light

You may have noticed that since 2012 Google (another big player that is making strides in opening up data) has offered its "Google Knowledge Graph," an informative panel on the right-hand side of the page accompanying popular searches; it reads much like a Wikipedia entry. This creation is in line with linked open data; it is a graph mostly generated by computers rather than by human efforts.

For linked-open-data to work, it requires a triple:

SUBJECT -------- PREDICATE -------- OBJECT

The Predicate, or relationship, is the most important part.
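
For the programmers among us, here is a minimal sketch of how a triple might be held in Python. It is only an illustration: the `ex:` identifiers, the `Triple` type, and the `objects_of` helper are my own invented names, not part of any real LOD vocabulary or library.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """One statement: a subject, a predicate (the relationship), and an object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical identifiers, made up for this example only.
triples = [
    Triple("ex:EdgarAllanPoe", "ex:authorOf", "ex:TheRaven"),
    Triple("ex:TheRaven", "ex:isA", "ex:Poem"),
]


def objects_of(subject: str, predicate: str, data: List[Triple]) -> List[str]:
    """Return every object linked to `subject` by `predicate`."""
    return [t.obj for t in data if t.subject == subject and t.predicate == predicate]


print(objects_of("ex:EdgarAllanPoe", "ex:authorOf", triples))  # ['ex:TheRaven']
```

Notice that the question we ask ("what did Poe write?") is phrased entirely in terms of the predicate, which is why the relationship carries so much of the weight.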

Let us first go to "The Raven" example:

The following slides with the black background are from an excellent video by OCLC, the non-profit Online Computer Library Center, entitled "Linked Data for Libraries."



                     Photo Credit: Thomas Kilduff, Slide of OCLC's Linked Data for Libraries Video



                     Photo Credit: Thomas Kilduff, Slide of OCLC's Linked Data for Libraries Video



In linked open data, one prerequisite is that each person, item, or concept has a URI (Uniform Resource Identifier). This is a broader notion than the more familiar URL (Uniform Resource Locator): a URL says where to find something on the web, whereas a URI identifies the thing itself. Think of a URI as similar to a book's ISBN, the 13-digit International Standard Book Number that identifies a particular edition of a book. Thus, "The Raven," by Edgar Allan Poe, has a different URI than the common, squawking black bird, the raven (although each is equally spooky). LOD systems are able to disambiguate between them. With separate URIs, linked open data initiatives do not have to rely on context to tell the two apart (OCLC, 2012).
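
To make the disambiguation point concrete, here is a small Python sketch. Everything in it is assumed for illustration: the `https://example.org/...` URIs, the `label`/`type`/`author` predicates, and the two helper functions are made up and do not come from OCLC, VIAF, or any real dataset.

```python
# Hypothetical URIs: one for the poem, one for the bird.
RAVEN_POEM = "https://example.org/id/the-raven-poem"
RAVEN_BIRD = "https://example.org/id/raven-bird"

# Each fact is a (subject, predicate, object) triple.
facts = {
    (RAVEN_POEM, "label", "The Raven"),
    (RAVEN_POEM, "type", "Poem"),
    (RAVEN_POEM, "author", "https://example.org/id/edgar-allan-poe"),
    (RAVEN_BIRD, "label", "raven"),
    (RAVEN_BIRD, "type", "Bird"),
}


def candidates(word):
    """URIs whose human-readable label contains `word` (ambiguous on purpose)."""
    return sorted(
        s for s, p, o in facts
        if p == "label" and word.lower() in o.lower()
    )


def describe(uri):
    """Everything stated about one URI, as a predicate -> object mapping."""
    return {p: o for s, p, o in facts if s == uri}


# The bare word "raven" matches both things...
print(candidates("raven"))
# ...but each URI describes exactly one of them.
print(describe(RAVEN_POEM)["type"])
print(describe(RAVEN_BIRD)["type"])
```

The word alone is ambiguous, but once a statement is attached to a URI rather than to a word, there is nothing left to guess.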


                     Photo Credit: Thomas Kilduff, Slide of OCLC's Linked Data for Libraries Video

So, just for fun, I wondered whether linked open data would find a connection between Glenn Close and Easter, a sort of parlor game of "Two Degrees of Separation." Glenn Close played an unstable femme fatale in the 1987 hit thriller "Fatal Attraction," and her character did something unspeakable to a pet bunny rabbit. Would she therefore be connected to the Christian holiday of Easter, which is popularly represented (at least in the U.S.) by the famous Easter Bunny?
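
For the curious, the parlor game can be sketched in a few lines of Python as a search for a chain of triples linking two things. To be clear, the triples below are ones I made up to mirror the reasoning in this paragraph; they are not drawn from Google, Schema.org, VIAF, or any other real source, and the function names are my own.

```python
from collections import deque

# Invented (subject, predicate, object) triples for illustration only.
triples = [
    ("GlennClose", "starredIn", "FatalAttraction"),
    ("FatalAttraction", "features", "Rabbit"),
    ("EasterBunny", "isA", "Rabbit"),
    ("Easter", "symbolizedBy", "EasterBunny"),
]


def build_neighbors(data):
    """Treat every triple as a two-way link between its subject and object."""
    neighbors = {}
    for s, _predicate, o in data:
        neighbors.setdefault(s, set()).add(o)
        neighbors.setdefault(o, set()).add(s)
    return neighbors


def shortest_path(start, goal, data):
    """Breadth-first search: the shortest chain from start to goal, or None."""
    neighbors = build_neighbors(data)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(neighbors.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


print(shortest_path("GlennClose", "Easter", triples))
```

With these hand-written triples, a chain exists by construction. Whether any real knowledge graph contains the equivalent statements is exactly what the searches are meant to test.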


                                         Photo Credit: Thomas Kilduff, Still of Google search "Glenn Close Bunny"


                                      Photo Credit: Thomas Kilduff, Still of DuckDuckGo search "Glenn Close Bunny"




Here you see that Google includes a Knowledge Graph when the search terms "Glenn Close" and "bunny" are paired, but DuckDuckGo does not (although DuckDuckGo does include a knowledge graph when "Glenn Close" is searched alone). Neither search engine provides a Knowledge Graph for the exotic pairing of "Glenn Close" and "Easter." Coincidentally, it looks like Glenn Close (whom I adore, for the record) wore a periwinkle, tin foil-like dress to a premiere in Australia last April, so a critic from the Sydney Morning Herald wondered if it heralded a trend (Singer, 2019).


                            Photo Credit: Thomas Kilduff, Google Search "Glenn Close Easter"


                                                     Photo Credit: Jordan Strauss, AP

So why is all of this important?

Many hardworking individuals and organizations have been making LOD a reality through initiatives such as Schema.org and VIAF (the Virtual International Authority File). Librarians, especially, have been on the front lines, as we specialize in taxonomies and ontologies. In fact, libraries of all kinds (medical, public, academic, legal) already use linked data in their day-to-day operations.

Beyond that, the principle is about treating the internet and the World Wide Web as a shared resource, a utility rather than a capitalist playground. Cooperation would flourish, and those who want to cordon off their proprietary data would be banished from the marketplace and the conversation.

Personal Reservations

The idea of LOD is full of earnestness and idealism, and it is one in which hardworking individuals work across borders and languages. Participants include the national libraries of some three dozen countries as diverse as New Zealand, Chile, Iceland, Lebanon, and Singapore (VIAF). If we can do that with coding languages, think of how we can share brainpower on other big-ticket agenda items like tackling inequality, poverty, disease, and climate change.

But it must be said: Are we giving VIAF too much power as a centralising force? Are triples subject to hacking and graffiti? What if a spurned ex-lover were somehow able to hack into the system and claim that YOU (subject) ----- ROBBED (predicate) ----- BANKS (object)? Would that be tedious, if not impossible, to clean up?

Additionally, is there a place for humour, irreverence, and poetry (excepting Poe) in the world of linked data? Would our online society become too literal? Would it become less creative, a breeding ground for what J.K. Rowling calls "Muggles"?

And if everything is limited by the predicate (or relationship) within the triple, does this limit the dynamism of the human experience? As Walt Whitman once wrote, "I contain multitudes." Were he alive today, would he find LOD too stifling?

------


Resources

Floridi, L. (2009) 'Web 2.0 vs. the Semantic Web: A Philosophical Assessment', Episteme. Available at: https://uhra.herts.ac.uk/handle/2299/3629 (Accessed: 30 October 2019).

OCLC (2012) Linked Data for Libraries [YouTube Video]. Available at: https://www.youtube.com/watch?v=fWfEYcnk8Z8 (Accessed: 30 October 2019).

VIAF: Virtual International Authority File. Available at: https://www.viaf.org/ (Accessed: 30 October 2019).

Links (Chronologically) 

City, University of London https://www.city.ac.uk/study/courses/postgraduate/library-science
Linked Open Data (LOD) https://www.w3.org/standards/
Tim Berners-Lee https://internethalloffame.org/inductees/tim-berners-lee
2009 TED talk https://www.ted.com/talks/tim_berners_lee_on_the_next_web
HTML https://html.com/
Information Silo https://en.wikipedia.org/wiki/Information_silo

6 comments:

  1. Super post! Thank you for this very detailed analysis of LOD, with personal experiments to boot, collectively making a fascinating read. Your reflections towards the end, I thought, raised some very pertinent considerations, particularly about the principle of the internet and the web as a shared resource rather than a capitalist playground. Let's hope that in this context LOD is used in a meaningful and cooperative way and not exploited to bring about negative consequences.

    Jonathan Benedetti

  2. This comment has been removed by the author.

  3. The potential stifling of creativity is not a consequence of LOD that I had considered before and I think you raise an excellent point in that.
    “the principle is about treating the internet and the world wide web as a shared resource” - this resonated with me so much, in my opinion there have been too many moves (some admittedly with good intentions) to stifle the openness of the web and it’s so important we not forget why it exists in the first place. Especially since many societies are now so dependent on the web for their functioning it’s important to take time and consider the potential impact of any changes to it.
    The way you describe the principles of LOD and the semantic web has been helpful to me as I have found it difficult at times to get my head around these concepts. Your post is informative and entertaining; and I love how you decided to experiment with LOD!

  4. Fascinating post, Tom, and Glenn Close is never not pertinent. I believe the question of whether the internet constitutes a public space, and the degree to which private interests have colonised that space, such as by encoding it to exclude public oversight, will be a key question in how we manage the future development of cyberspace. This question is already poignant as we see how sites like Twitter, which have become a 'commons', can exclude points of view that run contrary to the company ethos, thus overriding the provisions for free speech that would be protected by law in other arenas.

  5. I wonder whether this kind of principle could/should be backed up by primary legislation that forces these behemoths (as you say) to open their data...

    Is the issue that data is still largely considered a resource and commodity that gives competitive edge in a free market? How would you reconcile the want for privacy with the want for open society?

    Thank you for sharing your thoughts.

  6. Very interesting experiment and observation, Thomas. It is a good reflection on, and critique of, the responsibility of protecting and managing data.
