Hacking with Semantic Web


                           Tom Praison
                   Developer @ Yahoo!
         http://twitter.com/tompraison
What’s in here?
• Evolution of the web
• Poorly Solved Information Needs
• Semantic Web Technologies
• Linked Data
• Demo of confhopper.in, a site built using open
  datasets
• Some techniques for getting structured
  information from the Web
• Demo of Yahoo! Contextual Analysis Platform and
  Open Dapper
I just had to take the hypertext
                   idea and connect it to the
                   Transmission Control Protocol
                   and domain name system ideas
                   and—ta-da!—the World Wide
                   Web.




Tim Berners-Lee – Inventor of the WWW
WEB 1.0
Few Content Creators! Majority Consumers!




                         http://www.flickr.com/photos/leandrociuffo/3665883373/
WEB 2.0




          Web as a platform
          http://www.flickr.com/photos/lambertwm/4737580179/
WEB 1.0 vs WEB 2.0

WEB 1.0                        WEB 2.0
Ofoto                          Flickr
Personal Website               Blogging
Britannica Online              Wikipedia
Directories (taxonomy)         Tagging (“folksonomy”)
Content Management Systems     Wikis
WEB 3.0




      Which direction will it take?

                          http://www.flickr.com/photos/markhillary/337685031
WEB 3.0
Could be anything!
• Semantic Web
• Virtual Web
• Pervasive Web
• Personalization
• Artificial Intelligence
Today’s Web




A Web of Documents rather than Data!
Poorly Solved Information Needs
• Multiple interpretations
   – Apple
• Long tail queries
   – Roja (I meant a South Indian actress)
• Imprecise or overly precise searches
   – jim hendler
   – pictures of strong adventures people
• Searches for descriptions
   – countries in africa
   – 25 year old computer engineer living in Bangalore
   – Reliable smart phone under 15,000 rupees
THE SOLUTION




               Semantic Web
Publish data on the Web
• Linked Data: linking data similar to how we link
  documents on the Web
• Query databases over the Web
Architectural Challenges
• A common format for sharing data
• Sharing the meaning of data
• Infrastructure
Semantic Web standards from W3C
• Data and schema
  languages
  (RDF, OWL, RIF)
• Document formats
  (RDF/XML, RDFa)
• Protocols
  (SPARQL, HTTP)
Current Research & Other Efforts
• Semantic Web research into knowledge
  representation and reasoning, data
  integration, data quality and many other
  topics
• Community effort (Linked Data movement)
RDF (Resource Description Framework)
• The basic data model of the Semantic Web
   – A universal model to capture all sorts of data:
     networks, relational, object-oriented…
• Basic unit of information is a triple
   – A tuple of (subject, predicate, object)
   – Example: (Joe, loves, Mary)
   – Each triple gives the value of a property for a given
     resource or relates two objects to one another
      • Object is either a resource or a literal
• An RDF model is a set of triples
   – Ordering of statements in an RDF document is irrelevant
     (unlike XML)
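
The triple model above can be sketched in a few lines of Python. This is only an illustration, not a real RDF library: the `Triple` type, the two models and the `objects_of` helper are made-up names, and plain strings stand in for resources and literals.

```python
# A minimal sketch: an RDF model as a plain Python set of triples.
# All names here (Triple, model_a, model_b, objects_of) are illustrative.
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str  # a resource identifier or a literal value


# The same three statements, written in two different orders.
model_a = {
    Triple("Joe", "loves", "Mary"),
    Triple("Joe", "name", "Joe A."),
    Triple("Mary", "name", "Mary B."),
}
model_b = {
    Triple("Mary", "name", "Mary B."),
    Triple("Joe", "loves", "Mary"),
    Triple("Joe", "name", "Joe A."),
}

# Sets ignore ordering, so both models compare equal.
same_model = model_a == model_b


def objects_of(model, subject, predicate):
    """Return every object stated for a given subject and predicate."""
    return {t.obj for t in model if t.subject == subject and t.predicate == predicate}
```

Because the model is a set, stating the same triple twice adds nothing, which matches the idea that an RDF model is a set of statements rather than an ordered document.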
Graphical and textual notation
   my:Joe --type--> foaf:Person
   my:Joe --name--> “Joe A.”

A number of ways to serialize an RDF model into an RDF document:
RDF/XML, Turtle, N3, N-Triples
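
As a rough illustration of serialization, the sketch below writes the two-triple graph from this slide in an N-Triples-like form (one statement per line, URIs in angle brackets, literals in quotes). The `my:` namespace URI is an assumption made up for the example, and the escaping covers only backslashes and double quotes, so treat it as a sketch rather than a complete N-Triples writer.

```python
# A minimal sketch of writing triples as N-Triples-style lines.
# MY is a hypothetical namespace; Literal, term and to_ntriples are
# illustrative names, not part of any RDF library.
from dataclasses import dataclass

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF = "http://xmlns.com/foaf/0.1/"
MY = "http://example.org/my#"  # assumed namespace for the my: prefix


@dataclass(frozen=True)
class Literal:
    value: str


def term(t):
    """Render a URI as <...> and a literal as a quoted string."""
    if isinstance(t, Literal):
        escaped = t.value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    return f"<{t}>"


def to_ntriples(triples):
    """Serialize a set of (subject, predicate, object) tuples, one per line."""
    ordered = sorted(triples, key=str)  # sorting only makes the output stable
    return "\n".join(f"{term(s)} {term(p)} {term(o)} ." for s, p, o in ordered)


model = {
    (MY + "Joe", RDF_TYPE, FOAF + "Person"),
    (MY + "Joe", FOAF + "name", Literal("Joe A.")),
}
document = to_ntriples(model)
```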
RDF is designed for the Web
• URIs provide web-wide global identification across datasets
   – A resource may be described by multiple
     documents
   – URIs are intended to be reused
   – Unique, but not single identifiers: two URIs may
     denote the same thing
RDF is designed for the Web
• URIs can be retrieved from the Web
   – A well-behaved URI returns a description of the
     resource
   – Provides authority: the definition of foaf:Person
     lives at that URI
• Ontologies can be looked up as well
   – Typically at the root of the URIs, also known as the
     namespace
   – Example: http://xmlns.com/foaf/0.1/Person
     redirects to the specification
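
The relationship between a prefixed name such as foaf:Person, its full URI and its namespace can be sketched as below. The `PREFIXES` table and both helper functions are illustrative names; the split at the last `#` or `/` is a common convention assumed here, not a rule taken from the slides.

```python
# A minimal sketch of prefix expansion and namespace lookup.
# PREFIXES, expand and namespace_of are illustrative names.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}


def expand(prefixed_name, prefixes=PREFIXES):
    """Turn 'foaf:Person' into the full URI it abbreviates."""
    prefix, sep, local = prefixed_name.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown prefix in {prefixed_name!r}")
    return prefixes[prefix] + local


def namespace_of(uri):
    """Return the part of the URI up to and including the last '#' or '/'."""
    cut = max(uri.rfind("#"), uri.rfind("/"))
    return uri[: cut + 1]


person_uri = expand("foaf:Person")
person_namespace = namespace_of(person_uri)
```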
URIs implicitly link data together

Joe’s homepage
   (#joe, #name, “Joe A.”)
   (#joe, #email, mailto:joe@joe.com)

A social networking site
   (#joe, #loves, #mary)

Mary’s homepage
   (#mary, #name, “Mary B.”)
   (#mary, #gender, “female”)

Schema doc
   (#name, #type, #Property)
   (#name, #domain, #Person)
Put together, triples form a single ‘global’ graph
   #joe  --#name-->   “Joe A.”
   #joe  --#email-->  “joe@joe.com”
   #joe  --#loves-->  #mary
   #mary --#name-->   “Mary B.”
   #mary --#gender--> “female”
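
One way to picture the ‘global’ graph: triples published in separate documents are simply unioned, and shared identifiers such as #joe and #mary connect them. The sketch below uses plain Python sets; the three source sets mirror the documents in the example, while the helper `names_of_loved` is a made-up name for illustration.

```python
# A minimal sketch: merging triples from separate documents into one graph.
# The helper name names_of_loved is illustrative.
joe_homepage = {
    ("#joe", "#name", "Joe A."),
    ("#joe", "#email", "mailto:joe@joe.com"),
}
social_site = {
    ("#joe", "#loves", "#mary"),
}
mary_homepage = {
    ("#mary", "#name", "Mary B."),
    ("#mary", "#gender", "female"),
}

# Set union is the whole merge step: shared identifiers do the linking.
global_graph = joe_homepage | social_site | mary_homepage


def names_of_loved(graph, person):
    """Follow #loves from a person, then read #name of whoever is reached."""
    loved = {o for s, p, o in graph if s == person and p == "#loves"}
    return {o for s, p, o in graph if s in loved and p == "#name"}


result = names_of_loved(global_graph, "#joe")
```

No single source document can answer this question on its own: the #loves link comes from the social networking site and the name comes from Mary’s homepage.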
RDF Example
Linked Data cloud: interlinked RDF
          datasets on the Web
http://linkeddata.org/
DBPedia
• DBpedia is a dataset that contains much of the
  structured data in Wikipedia
  – Data from the info-boxes
  – Links between Wikipedia pages
  – Categories
  – Disambiguation and redirect pages
• Links to other datasets
Fetching individual resources
• Use your web browser
  • http://dbpedia.org/resource/Yahoo redirects to
    http://dbpedia.org/page/Yahoo
  • You can plug this URI into other Linked Data browsers
• HTTP GET to fetch data
  – Using curl: add Accept: application/rdf+xml for RDF
    and follow redirects with -L
      • curl -L -H 'Accept: application/rdf+xml'
        'http://dbpedia.org/resource/Berlin'
• Data dumps
  – http://wiki.dbpedia.org/Datasets
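
The curl call above could be written with Python's standard library roughly as follows. The function names are made up for this sketch, and whether the server still answers this way today is an assumption; `urllib` follows redirects by default, which plays the role of curl's -L flag.

```python
# A minimal sketch of content negotiation with urllib (standard library).
# build_rdf_request and fetch_rdf are illustrative names.
import urllib.request


def build_rdf_request(resource_uri, accept="application/rdf+xml"):
    """Prepare a GET request that asks for RDF instead of an HTML page."""
    return urllib.request.Request(resource_uri, headers={"Accept": accept})


def fetch_rdf(resource_uri):
    """Send the request and return the raw response body as bytes."""
    request = build_rdf_request(resource_uri)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()


request = build_rdf_request("http://dbpedia.org/resource/Berlin")
```

Calling `fetch_rdf("http://dbpedia.org/resource/Berlin")` needs network access, so the request is only built here, not sent.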
Querying using SPARQL
• Interactive query builders
     • SPARQL Explorer: http://dbpedia.org/snorql/
     • Examples at: http://wiki.dbpedia.org/OnlineAccess
• Using HTTP GET
  – GET /sparql?query=EncodedQuery HTTP/1.1
  – Example:
     • SELECT ?film WHERE {
       ?film <http://dbpedia.org/ontology/language>
       <http://dbpedia.org/resource/French_language> . ?film
       <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
       <http://dbpedia.org/ontology/Film> }
     • curl 'http://dbpedia.org/sparql?query=EncodedQuery'
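
The “EncodedQuery” placeholder is the SPARQL text after URL encoding. A sketch of producing it with Python's standard library is below; it uses a version of the example query that selects only ?film. The helper names are illustrative, and the sketch only builds the URL, it does not contact the endpoint.

```python
# A minimal sketch: URL-encoding a SPARQL query for an HTTP GET request.
# sparql_get_url and query_from_url are illustrative names.
from urllib.parse import parse_qs, urlencode, urlsplit

ENDPOINT = "http://dbpedia.org/sparql"

QUERY = """SELECT ?film WHERE {
  ?film <http://dbpedia.org/ontology/language>
        <http://dbpedia.org/resource/French_language> .
  ?film <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
        <http://dbpedia.org/ontology/Film>
}"""


def sparql_get_url(endpoint, query):
    """Return endpoint?query=<encoded query>, safe to pass to curl or a browser."""
    return endpoint + "?" + urlencode({"query": query})


def query_from_url(url):
    """Decode the query parameter again, to check the round trip."""
    return parse_qs(urlsplit(url).query)["query"][0]


url = sparql_get_url(ENDPOINT, QUERY)
```

The resulting `url` can be pasted into the curl command in place of the placeholder.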
ConfHopper.in
• Award-winning app at the WWW2012 Metadata
  Challenge.
• Confhopper.in is a desktop / mobile HTML5 based
  application designed for conference attendees.
• Built with the help of open datasets from
  http://data.semanticweb.org/ and various other
  sources.
Some Techniques for Getting
   Structured Information from the Web
• Semantic Markup
• NER
• Extraction Tools (Dapper)
Semantic Markup
•   Microdata (Schema.org)
•   RDFa
•   Open Graph Protocol (ogp.me)
•   Example:
    http://getschema.org/microdataextractor?url=http://www.tompraison.com&out=json
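
To give a feel for what a markup extractor does, here is a small sketch that pulls Open Graph properties out of a page with Python's built-in HTML parser. The sample HTML, the class name and the function name are all made up for illustration, and it is not how the extractor service linked above works; it only reads `<meta property="og:..." content="...">` tags.

```python
# A minimal sketch: reading Open Graph <meta> tags with the standard library.
# OpenGraphParser, extract_open_graph and SAMPLE_HTML are illustrative.
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        # Only <meta> tags carry Open Graph data.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("property") or ""
        content = attributes.get("content")
        if name.startswith("og:") and content is not None:
            self.properties[name] = content


def extract_open_graph(html):
    """Return a dict of og:* properties found in the given HTML text."""
    parser = OpenGraphParser()
    parser.feed(html)
    parser.close()
    return parser.properties


SAMPLE_HTML = """
<html><head>
  <title>Example page</title>
  <meta property="og:title" content="Hacking with Semantic Web" />
  <meta property="og:type" content="website">
  <meta name="description" content="not Open Graph">
</head><body></body></html>
"""

found = extract_open_graph(SAMPLE_HTML)
```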
NER – Named Entity Recognition
• Yahoo! Content Analysis API
• http://developer.yahoo.com/contentanalysis/
Dapper




http://open.dapper.net

Dapper is a tool that enables users to create update feeds for
their favorite sites and website owners to optimize and
distribute their content in new ways.
References
• http://www.slideshare.net/tompraison
• http://inkdroid.org/journal/2010/06/04/the-5-stars-of-open-linked-data/
• http://www.freebase.com/
• http://dbpedia.org/About
