Web Science & Technologies

University of Koblenz ▪ Landau, Germany

Information-Rich Programming
in F#
with Semantic Data
Linked Open Data Cloud
Where’s the Data in
the Big Data Wave?
Gerhard Weikum
SIGMOD Blog, 6.3.2013
http://wp.sigmod.org/

… the Web of Linked Data consisting of
more than 30 Billion RDF triples from
hundreds of data sources …

WeST

Steffen Staab
staab@uni-koblenz.de

2
Some „Bubbles“ of the LOD Cloud

RDF: Simple Foundations

Example RDF Graph

Native Graph
OR
R2RML: RDB to RDF Mapping Language
(W3C rec)
Agenda

LITEQ – Language integrated types, extensions and queries for RDF graphs
  • Exploring
  • Programming, Typing
Evaluation of LITEQ (NPQL) against SPARQL
  • Understandability
  • Ease of use
SchemEX
  • Construction of schema-based index
  • Schema induction
Programming against unknown data source

Exploring a data source

Using a data source
Example application

• Goal: Application that helps to collect dog license fees
• Send email reminders to dog owners

• Data is given as an RDF graph
Programmer's Task 1: Schema Exploration

Schema exploration & Identification of important RDF types
• Find RDF types representing dogs and persons

Naive Approach Task 1: Schema Exploration

Schema exploration & Identification of important RDF types
• Find RDF types representing dogs and persons
Tooling for Naive Approach: SPARQL Query Formulation
Programmer's Task 2: Code Type Creation

Code type creation in host language
• Convert the identified dog and person RDF types to code types in the host language

type exCreature(uri) = class
    member this.hasName : string = …
    member this.hasAge : int = …
end
type exPerson(uri) = class
    inherit exCreature(uri)
end
type exDog(uri) = class
    inherit exCreature(uri)
    member this.hasOwner : exPerson = …
    member this.TaxNo : int = …
end
Programmer's Task 3: Data querying

Data querying
• Write a query that returns all dog owners

Naive Approach Task 3: Data querying

Data querying
• Write a query that returns all dog owners

Tooling for Naive Approach: SPARQL Query formulation

Naive Approach Task 4: Object manipulation

Create the objects, manipulate them & make them persistent
• Develop functionality around query to send reminder

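// SPARQL query for all dog owners (dbConnection, exPerson and sendReminderEmail are assumed to exist)
let queryString = "SELECT ?owner WHERE {
    ?dog rdf:type ex:Dog .
    ?dog ex:hasOwner ?owner
}"

// Wrap each returned owner URI in a code type and send the reminder
dbConnection.evaluate(queryString) |> Seq.iter (fun uri ->
    let p = new exPerson(uri)
    sendReminderEmail(p)
)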
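To make the naive approach concrete, here is a minimal Python sketch of what evaluating that SPARQL pattern amounts to on a tiny in-memory graph. The triples (including the dog ex:Hasso) and the function name are illustrative assumptions, not data from the talk; only ex:Bob as the resulting owner mirrors the later slides.

```python
# Toy RDF graph as (subject, predicate, object) triples; all instances are made up.
TRIPLES = [
    ("ex:Hasso", "rdf:type", "ex:Dog"),
    ("ex:Hasso", "ex:hasOwner", "ex:Bob"),
    ("ex:Bob", "rdf:type", "ex:Person"),
    ("ex:Alice", "rdf:type", "ex:Person"),
]


def dog_owners(triples):
    """Answer: SELECT ?owner WHERE { ?dog rdf:type ex:Dog . ?dog ex:hasOwner ?owner }"""
    # First triple pattern: bind ?dog to everything typed ex:Dog.
    dogs = {s for (s, p, o) in triples if p == "rdf:type" and o == "ex:Dog"}
    # Second triple pattern: join on ?dog and project ?owner.
    return sorted({o for (s, p, o) in triples if p == "ex:hasOwner" and s in dogs})


for owner in dog_owners(TRIPLES):
    print("send reminder to", owner)
```

Note that nothing in the query string is checked against the schema: a misspelled type or property would simply return no owners, which is the weakness the LITEQ approach targets.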

The LITEQ approach

Node Path Query Language

Graph Traversal with NPQL: Subtype Navigation >
NPQL

rdf:Resource > ex:Creature

Graph Traversal with NPQL: Property Navigation .
NPQL

ex:Dog . ex:hasOwner

Extensional Semantics: Task 3 – Querying for Owners
NPQL

rdf:Resource > ex:Creature > ex:Dog . ex:hasOwner -> Extension
• Select ex:Dog
• Walk through ex:hasOwner to ex:Person
• Use extension to retrieve all persons who own dogs: ex:Bob
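The extensional reading of such a path can be sketched in Python as follows. This is an illustration under assumptions, not the LITEQ implementation: the schema is encoded with rdfs:subClassOf and rdfs:range triples, the step encoding (">" and ".") is hypothetical, and the instances other than ex:Bob are made up.

```python
# Toy graph: schema triples plus a few instances (ex:Hasso and ex:Alice are made up).
TRIPLES = [
    ("ex:Creature", "rdfs:subClassOf", "rdf:Resource"),
    ("ex:Dog", "rdfs:subClassOf", "ex:Creature"),
    ("ex:Person", "rdfs:subClassOf", "ex:Creature"),
    ("ex:hasOwner", "rdfs:range", "ex:Person"),
    ("ex:Hasso", "rdf:type", "ex:Dog"),
    ("ex:Bob", "rdf:type", "ex:Person"),
    ("ex:Alice", "rdf:type", "ex:Person"),
    ("ex:Hasso", "ex:hasOwner", "ex:Bob"),
]


def instances(rdf_type):
    """All subjects directly typed with rdf_type (no inference)."""
    return {s for (s, p, o) in TRIPLES if p == "rdf:type" and o == rdf_type}


def extension(start_type, steps):
    """Evaluate a path of ('>', subtype) and ('.', property) steps extensionally."""
    current_type, nodes = start_type, None  # nodes=None means "all instances of current_type"
    for op, arg in steps:
        if op == ">":  # subtype navigation: only allowed along the schema
            if (arg, "rdfs:subClassOf", current_type) not in TRIPLES:
                raise ValueError(f"{arg} is not a direct subtype of {current_type}")
            current_type, nodes = arg, None
        elif op == ".":  # property navigation: follow edges from the current nodes
            sources = instances(current_type) if nodes is None else nodes
            nodes = {o for (s, p, o) in TRIPLES if p == arg and s in sources}
            current_type = next(o for (s, p, o) in TRIPLES
                                if s == arg and p == "rdfs:range")
        else:
            raise ValueError(f"unknown step {op!r}")
    return sorted(instances(current_type) if nodes is None else nodes)


path = [(">", "ex:Creature"), (">", "ex:Dog"), (".", "ex:hasOwner")]
print(extension("rdf:Resource", path))
```

In this toy graph ex:Alice is a person but owns no dog, so only ex:Bob is part of the extension, matching the result named on the slide.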
Intensional Semantics: Task 2 - Creating Person Code Type
NPQL

rdf:Resource > ex:Creature > ex:Dog . ex:hasOwner -> Intension
• Select ex:Person node
• "Intension" to get code type based on RDF type

type exCreature(uri) = class
    member this.hasName : string = …
    member this.hasAge : int = …
end
type exPerson(uri) = class
    inherit exCreature(uri)
end
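The idea of deriving a code type from an RDF type can be imitated in Python by creating classes at runtime. The sketch below is only an analogy to the F# type provider mechanism: the SCHEMA dictionary, the intension function and the rdf_properties attribute are hypothetical names chosen for illustration.

```python
# Hypothetical schema: RDF type -> (parent RDF type, {property name: Python type}).
SCHEMA = {
    "ex:Creature": (None, {"hasName": str, "hasAge": int}),
    "ex:Person": ("ex:Creature", {}),
}

_CLASSES = {}  # cache so that each RDF type maps to exactly one Python class


def intension(rdf_type):
    """Return a Python class for rdf_type, creating its parent classes first."""
    if rdf_type in _CLASSES:
        return _CLASSES[rdf_type]
    parent, props = SCHEMA[rdf_type]
    base = intension(parent) if parent else object

    def init(self, uri):
        self.uri = uri  # every generated object wraps the URI of an RDF node

    cls = type(rdf_type.replace(":", ""), (base,),
               {"__init__": init, "rdf_properties": dict(props)})
    _CLASSES[rdf_type] = cls
    return cls


ExPerson = intension("ex:Person")
bob = ExPerson("ex:Bob")
print(ExPerson.__name__, "inherits from", ExPerson.__mro__[1].__name__)
```

Unlike the erased types discussed later in the talk, these Python classes keep the full hierarchy at runtime, so isinstance checks against the parent type work.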
Autocompletion Semantics: Task 1 - Exploration
NPQL

rdf:Resource > ex:Creature >
Suggestions during query writing
• Instances based on extensional semantics
• Types & properties based on intensional semantics
Suggested here: ex:Person, ex:Dog
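A possible way to compute such suggestions from schema information is sketched below. The two lookup tables and the suggest function are assumptions for illustration; the property names ex:hasName and ex:hasAge are borrowed from the code types shown earlier, with an assumed ex: prefix.

```python
# Hypothetical schema tables: subtype -> supertype, property -> domain type.
SUBCLASS_OF = {
    "ex:Creature": "rdf:Resource",
    "ex:Dog": "ex:Creature",
    "ex:Person": "ex:Creature",
}
DOMAIN = {
    "ex:hasOwner": "ex:Dog",
    "ex:hasName": "ex:Creature",
    "ex:hasAge": "ex:Creature",
}


def suggest(current_type):
    """Offer continuations for a query that currently ends at current_type."""
    subtypes = sorted(t for t, parent in SUBCLASS_OF.items() if parent == current_type)
    properties = sorted(p for p, domain in DOMAIN.items() if domain == current_type)
    return {">": subtypes, ".": properties}


print(suggest("ex:Creature"))
```

After typing "rdf:Resource > ex:Creature >", the ">" entry lists ex:Dog and ex:Person, the two suggestions shown on the slide.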
Extensional Semantics: LA Conjunctive Queries
NPQL

ex:Dog <- ex:hasOwner
Left-associative conjunctive query with projection
Host Language Extension: Task 4 – Create Objects

Create the objects, manipulate them & make them persistent
• Develop the functionality around the query that will send the reminder using LITEQ in F#

Preliminary implementation in F#:
http://west.uni-koblenz.de/Research/systems/liteq
Live demo of LITEQ in Visual Studio/F#
Related Work
Task                                 | LINQ | XML Type Provider  | Freebase Type Provider         | LITEQ current version     | LITEQ concept
1 Schema exploration                 | -    | (✔) per doc        | (✔) only trees                 | ✔                         | ✔
2 Code type creation                 | -    | (✔) erased types?  | (✔) erased types               | (✔) erased types          | ✔ full hierarchy
3 Data querying                      | ✔    | -                  | ((✔)) very limited expressiv.  | (✔) limited expressiv.    | ✔ no full SPARQL
4 Object manipulation & persistence  | (✔)  | ✔                  | -                              | ✔ no new object creation  | ✔
Future work wrt LITEQ

• Current implementation is a prototype
• Current implementation uses erased types
  • At runtime, no type hierarchy is present
• Switch to generated types in the future
  • Higher expressiveness in the host language by exploiting the type hierarchy
• Optimizations of the LITEQ implementation are necessary
  • Lazy evaluation
  • Distinguish between design time and runtime
  • Not all types created at design time are needed at runtime
• Formalize the query language and investigate its expressiveness
Challenge: Joint Type Inference

Data modeling world
Description Logics

Program modeling world
ML type inference

RDF

UML class
diagrams

Agenda

LITEQ – Language integrated types, extensions and queries for RDF graphs
  • Exploring
  • Programming, Typing
Evaluation of LITEQ (NPQL) vs. SPARQL
  • Understandability
  • Ease of use
SchemEX
  • Where do I find relevant data?
  • Efficient construction of a schema-level index
Preliminary Evaluation of LITEQ/NPQL

Focused on NPQL
• Reason:
Test subjects lacked knowledge of F# and functional
programming for evaluating LITEQ in full
• Comparing NPQL against SPARQL
Main Hypothesis of Evaluation
• NPQL with autocompletion allows effective query writing in a more efficient manner than SPARQL

Thus, some of the advantages of LITEQ cannot show up in the evaluation!
Evaluation Subjects

Evaluation with 11 participants
• 1 subject a posteriori eliminated from analysis of evaluation,
because he could not deal with SPARQL at all!
• 10 subjects remaining for analysis
Participants
• Undergraduate students
• PhD students
• PostDocs

Evaluation - Setup

1. Pre-questionnaire
2. Training in RDF, SPARQL & NPQL
3. Experimental tasks to be solved by subjects
4. Post-questionnaire
Phase 1: Pre-Questionnaire – Knowledge & skills

• Programming: all "Intermediate" or above
• Object-orientation: 8 "Intermediate" or above
• Functional programming:
  4 "Intermediate" or above (Lisp, Haskell, F# (once))
  4 "none"
• .NET:
  1 "Expert"
  2 "Beginner"
  7 "none"
• SPARQL:
  3 "Intermediate" or above [SPARQL experts]
  7 below "Intermediate" [SPARQL novices]
Phase 2: Training in RDF, SPARQL, NPQL

Training in RDF & SPARQL
• Presentation of RDF & SPARQL (20 minutes)
• Practical exercise writing SPARQL queries in the web interface (5 minutes)
Training in NPQL
• Practical exercise writing NPQL queries in Visual Studio (5 minutes)
Phase 3: Solving experimental tasks by subjects

9 different experimental tasks to solve
• Half of the tasks in NPQL using Visual Studio
• Other half using SPARQL and a web interface
Task types
• Navigation and exploration of a data source (Task 1)
• Retrieving and answering questions about the data (Task 3)
• 2 tasks were not solvable in NPQL
  • Investigating how users deal with the limits of the language
Evaluation measure
• Duration to complete each task
Evaluation across different user types

Evaluations per Task

Phase 4: Post-Questionnaire
"Do you want to explore a data source in your IDE?"
4 "yes"
3 "no, prefer separation of steps"
3 "no preference"
"NPQL is easier to use than SPARQL"
7 "agree" or above

Other
• Better support when writing queries in SPARQL
• Better response times for interactive working with NPQL

My conclusion
Though LITEQ is still in a pre-alpha status, advantages became visible in the preliminary user evaluation.
Agenda

LITEQ – Language integrated types, extensions and queries for RDF graphs
  • Exploring
  • Programming, Typing
Evaluation of LITEQ (NPQL) against SPARQL
  • Understandability
  • Ease of use
SchemEX
  • Construction of schema-based index
  • Schema induction
Searching the LOD cloud

SELECT ?x
WHERE {
  ?x rdf:type foaf:Document .
  ?x rdf:type swrc:InProceedings .
  ?x dc:creator ?y .
  ?y rdf:type fb:Computer_Scientist
}

[Figure: the query drawn as a graph pattern over the LOD cloud; node x is typed foaf:Document and swrc:InProceedings and linked via dc:creator to a node typed fb:Computer_Scientist; the cloud is marked with "?"]
Searching the LOD cloud
SELECT ?x
WHERE {
  ?x rdf:type foaf:Document .
  ?x rdf:type swrc:InProceedings .
  ?x dc:creator ?y .
  ?y rdf:type fb:Computer_Scientist
}

[Figure: the query is sent to an index, which returns the data sources ACM and DBLP]
Schema-level index

Schema information on LOD
• Explicit: assigning class types (Entity rdf:type Class)
• Implicit: modelling attributes (Entity Property Entity 2)
Schema-level index

[Figure: schema-level index; schema elements (classes C1, C2, C3 and properties P1, P2) observed for entities E1 and E2 (with literal XYZ) point to data source DS1]
Typecluster

• Entities with the same set of types

[Figure: type cluster TCj defined by classes C1, C2, ..., Cn and pointing to data sources DS1, DS2, ..., DSm]
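A type cluster index of this kind can be sketched in a few lines of Python. The quads below are invented for illustration (only the class names and the sources DBLP, ACM and BBC appear in the talk), and type_clusters is a hypothetical helper, not the SchemEX code.

```python
from collections import defaultdict

# N-Quads as (subject, predicate, object, data source); the entity URIs are made up.
QUADS = [
    ("ex:paper1", "rdf:type", "foaf:Document", "DBLP"),
    ("ex:paper1", "rdf:type", "swrc:InProceedings", "DBLP"),
    ("ex:paper2", "rdf:type", "foaf:Document", "ACM"),
    ("ex:paper2", "rdf:type", "swrc:InProceedings", "ACM"),
    ("ex:page1", "rdf:type", "foaf:Document", "BBC"),
]


def type_clusters(quads):
    """Map each distinct set of types to the data sources holding such entities."""
    types, sources = defaultdict(set), defaultdict(set)
    for s, p, o, ds in quads:
        if p == "rdf:type":
            types[s].add(o)
            sources[s].add(ds)
    clusters = defaultdict(set)
    for entity, type_set in types.items():
        clusters[frozenset(type_set)] |= sources[entity]
    return dict(clusters)


for type_set, data_sources in type_clusters(QUADS).items():
    print(sorted(type_set), "->", sorted(data_sources))
```

The key is the complete set of types, so an entity typed only foaf:Document lands in a different cluster than one typed both foaf:Document and swrc:InProceedings.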
Typecluster: Example

[Figure: the types foaf:Document and swrc:InProceedings form type cluster tc2309, which points to data sources DBLP and ACM]
Bi-Simulation

• Entities are equivalent if they refer with the same attributes to equivalent entities
• Restriction: 1-Bi-Simulation

[Figure: bisimulation class BSi defined by properties P1, P2, ..., Pn and pointing to data sources DS1, DS2, ..., DSm]
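One plausible reading of the 1-bisimulation restriction is that only a single step of outgoing properties is compared, so two entities are equivalent when they use the same set of properties. The Python sketch below implements that reading; it is an assumption about the definition, and the quads and the function name are illustrative.

```python
from collections import defaultdict

# N-Quads as (subject, predicate, object, data source); the entity URIs are made up.
QUADS = [
    ("ex:paper1", "dc:creator", "ex:alice", "DBLP"),
    ("ex:show1", "dc:creator", "ex:carol", "BBC"),
    ("ex:paper2", "dc:creator", "ex:dave", "DBLP"),
    ("ex:paper2", "dc:title", "A title", "DBLP"),
]


def bisimulation_classes(quads):
    """Group entities by their set of outgoing properties (one step only)."""
    props, sources = defaultdict(set), defaultdict(set)
    for s, p, o, ds in quads:
        if p != "rdf:type":  # type statements are handled by the type clusters
            props[s].add(p)
            sources[s].add(ds)
    classes = defaultdict(set)
    for entity, prop_set in props.items():
        classes[frozenset(prop_set)] |= sources[entity]
    return dict(classes)


for prop_set, data_sources in bisimulation_classes(QUADS).items():
    print(sorted(prop_set), "->", sorted(data_sources))
```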
Bi-Simulation: Example

[Figure: the property dc:creator forms bisimulation class bs2608, which points to data sources BBC and DBLP]
SchemEX: Combination TC and Bi-Simulation

• Partition of TC based on 1-Bi-Simulation with restrictions on the destination TC

[Figure: schema layer with type clusters TCj and TCk (each a set of classes) connected through bisimulation BSi (properties P1, P2, ..., Pn), yielding equivalence classes EQC; payload layer linking each equivalence class EQCj to data sources DS1, DS2, ..., DSm]
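Combining both ideas, an equivalence class can be keyed by an entity's own type set together with its outgoing properties and the type sets of the entities they point to. The Python sketch below follows that reading; the key layout, the quads and build_index are assumptions for illustration rather than the actual SchemEX data structures.

```python
from collections import defaultdict

# N-Quads as (subject, predicate, object, data source); entities and foaf:Person are made up.
QUADS = [
    ("ex:paper1", "rdf:type", "foaf:Document", "DBLP"),
    ("ex:paper1", "rdf:type", "swrc:InProceedings", "DBLP"),
    ("ex:paper1", "dc:creator", "ex:alice", "DBLP"),
    ("ex:alice", "rdf:type", "fb:Computer_Scientist", "DBLP"),
    ("ex:paper2", "rdf:type", "foaf:Document", "ACM"),
    ("ex:paper2", "rdf:type", "swrc:InProceedings", "ACM"),
    ("ex:paper2", "dc:creator", "ex:bob", "ACM"),
    ("ex:bob", "rdf:type", "foaf:Person", "ACM"),
]


def build_index(quads):
    """Key: (own type set, {(property, type set of target)}); value: data sources."""
    types = defaultdict(set)
    for s, p, o, _ in quads:
        if p == "rdf:type":
            types[s].add(o)
    links, sources = defaultdict(set), defaultdict(set)
    for s, p, o, ds in quads:
        sources[s].add(ds)
        if p != "rdf:type":
            links[s].add((p, frozenset(types[o])))
    index = defaultdict(set)
    for entity in sources:
        key = (frozenset(types[entity]), frozenset(links[entity]))
        index[key] |= sources[entity]
    return dict(index)


# The example query from the talk, expressed as an index key.
query_key = (
    frozenset({"foaf:Document", "swrc:InProceedings"}),
    frozenset({("dc:creator", frozenset({"fb:Computer_Scientist"}))}),
)
print(sorted(build_index(QUADS).get(query_key, set())))
```

In this toy data ACM also holds proceedings papers, but their creators are not typed fb:Computer_Scientist, so the lookup points to DBLP only.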
SchemEX: Example
[Figure: type cluster tc2309 (foaf:Document, swrc:InProceedings) is linked via dc:creator (bisimulation bs2608) to type cluster tc2101 (fb:Computer_Scientist); the resulting equivalence class eqc707 points to data source DBLP]

SELECT ?x
WHERE {
  ?x rdf:type foaf:Document .
  ?x rdf:type swrc:InProceedings .
  ?x dc:creator ?y .
  ?y rdf:type fb:Computer_Scientist
}
SchemEX: Computation

• Precise computation: brute force

[Figure: the same schema and payload layers as on the slide "SchemEX: Combination TC and Bi-Simulation"]
Stream-based Computation of SchemEX

• LOD crawler: stream of N-Quads (triple + data source)
… Q16, Q15, Q14, Q13, Q12, Q11, Q10, Q9, Q8, Q7, Q6, Q5, Q4, Q3, Q2, Q1

[Figure: the quads pass through a FiFo cache of numbered instances, which are assigned to classes C1, C2, C3]
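Why a stream-based computation only approximates the index can be illustrated with a bounded FIFO cache. The Python sketch below is an assumption about how such a cache might behave (entity-level cache, eviction of the oldest entity, type clusters only); the function name, window sizes and quads are invented for illustration.

```python
from collections import OrderedDict

# A tiny quad stream; the two statements about ex:paper1 are separated by another entity.
STREAM = [
    ("ex:paper1", "rdf:type", "foaf:Document", "DBLP"),
    ("ex:paper2", "rdf:type", "foaf:Document", "ACM"),
    ("ex:paper1", "rdf:type", "swrc:InProceedings", "DBLP"),
]


def stream_type_clusters(quads, window_size):
    """Build type clusters from a quad stream with a FIFO cache of window_size entities."""
    cache = OrderedDict()  # subject -> (types seen so far, data sources), in arrival order
    index = {}             # frozenset of types -> set of data sources

    def flush(types, sources):
        index.setdefault(frozenset(types), set()).update(sources)

    for s, p, o, ds in quads:
        if p != "rdf:type":
            continue
        if s not in cache:
            if len(cache) >= window_size:
                _, (types, sources) = cache.popitem(last=False)  # evict the oldest entity
                flush(types, sources)
            cache[s] = (set(), set())
        cache[s][0].add(o)
        cache[s][1].add(ds)
    for types, sources in cache.values():  # write out whatever is still cached
        flush(types, sources)
    return index


exact = stream_type_clusters(STREAM, window_size=2)
approx = stream_type_clusters(STREAM, window_size=1)
print(len(exact), "clusters with window 2;", len(approx), "clusters with window 1")
```

With a window of one entity, ex:paper1 is evicted before its second type arrives, so its schema information is split across two smaller clusters. This is the kind of deviation that a comparison against brute-force computation would measure.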
Quality of Approximated Index

Stream-based computation vs. brute force
Data set of 11 million triples
SchemEX @ BTC 2011

SchemEX
• Allows complex queries (star, chain)
• Scalable computation
• High quality

Index over BTC 2011 data
• 2.17 billion triples
• Index: 55 million triples

Commodity hardware
• VM: 1 core, 4 GB RAM
• Throughput: 39,500 triples/second
• Computation of full index: 15 h
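The reported figures are consistent with each other, as a quick back-of-the-envelope check in Python shows. The variable names are ad hoc, and the check assumes the throughput is sustained over the whole data set.

```python
# Figures taken from the slide.
total_triples = 2.17e9    # BTC 2011 data
index_triples = 55e6      # size of the resulting index
throughput = 39_500       # triples per second

hours = total_triples / throughput / 3600
ratio = index_triples / total_triples

print(f"estimated indexing time: {hours:.1f} h")
print(f"index size relative to data: {ratio:.1%}")
```

The estimate comes out slightly above 15 hours, in line with the stated computation time, and the index is roughly 2.5% of the size of the data it describes.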
Future work wrt SchemEX

Further exploration of
• schema induction
• query federation
Federation vs Link Traversal based query execution
• Granularity of query execution
• Too fine grained: URI dereferencing
• Too expressive: SPARQL
• Sweet spot -> NPQL??

Agenda

LITEQ – Language integrated types, extensions and queries for RDF graphs
  • Exploring
  • Programming, Typing
Evaluation of LITEQ (NPQL) against SPARQL
  • Understandability
  • Ease of use
SchemEX
  • Construction of schema-based index
  • Schema induction
Future

1. Searching for distributed data
2. Understanding distributed data
3. Intelligent queries on distributed data
4. Programming with distributed data
   • Type reuse
   • Type induction
Thank you for your attention!

More Related Content

PPTX
RDF Stream Processing: Let's React
PPT
OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...
PPTX
RDF Stream Processing Tutorial: RSP implementations
PDF
RSP4J: An API for RDF Stream Processing
PDF
Overview of the SPARQL-Generate language and latest developments
PDF
A Context-Based Semantics for SPARQL Property Paths over the Web
PPTX
Intro to Python for C# Developers
PDF
Summary of the Stream Reasoning workshop at ISWC 2016
RDF Stream Processing: Let's React
OrdRing 2013 keynote - On the need for a W3C community group on RDF Stream Pr...
RDF Stream Processing Tutorial: RSP implementations
RSP4J: An API for RDF Stream Processing
Overview of the SPARQL-Generate language and latest developments
A Context-Based Semantics for SPARQL Property Paths over the Web
Intro to Python for C# Developers
Summary of the Stream Reasoning workshop at ISWC 2016

What's hot (18)

PPT
On the need for a W3C community group on RDF Stream Processing
PDF
LDQL: A Query Language for the Web of Linked Data
PDF
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 1 (...
PDF
Embedded interactive learning analytics dashboards with Elasticsearch and Kib...
PDF
Large Scale Image Forensics using Tika and Tensorflow [ICMR MFSec 2017]
PDF
Making the big data ecosystem work together with python apache arrow, spark,...
PDF
Gradoop: Scalable Graph Analytics with Apache Flink @ Flink & Neo4j Meetup Be...
PDF
Data Programming: Creating Large Datasets, Quickly -- Presented at JPL MLRG
PDF
GluonNLP MXNet Meetup-Aug
PDF
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...
PDF
apidays LIVE Paris 2021 - GraphQL Today and Tomorrow by Uri Goldshtein, The G...
PDF
Fulfilling Apache Arrow's Promises: Pandas on JVM memory without a copy
PDF
Batch import of large RDF datasets into Semantic MediaWiki
PPTX
Overview of the TREC 2019 Deep Learning Track
PPTX
Data Science at Scale: Using Apache Spark for Data Science at Bitly
PPTX
Kaggle Competitions, New Friends, New Skills and New Opportunities
PDF
Congressional PageRank: Graph Analytics of US Congress With Neo4j
PPTX
The Reality of Digital Transfer @ArchivesNZ
On the need for a W3C community group on RDF Stream Processing
LDQL: A Query Language for the Web of Linked Data
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 1 (...
Embedded interactive learning analytics dashboards with Elasticsearch and Kib...
Large Scale Image Forensics using Tika and Tensorflow [ICMR MFSec 2017]
Making the big data ecosystem work together with python apache arrow, spark,...
Gradoop: Scalable Graph Analytics with Apache Flink @ Flink & Neo4j Meetup Be...
Data Programming: Creating Large Datasets, Quickly -- Presented at JPL MLRG
GluonNLP MXNet Meetup-Aug
Tutorial "An Introduction to SPARQL and Queries over Linked Data" Chapter 3 (...
apidays LIVE Paris 2021 - GraphQL Today and Tomorrow by Uri Goldshtein, The G...
Fulfilling Apache Arrow's Promises: Pandas on JVM memory without a copy
Batch import of large RDF datasets into Semantic MediaWiki
Overview of the TREC 2019 Deep Learning Track
Data Science at Scale: Using Apache Spark for Data Science at Bitly
Kaggle Competitions, New Friends, New Skills and New Opportunities
Congressional PageRank: Graph Analytics of US Congress With Neo4j
The Reality of Digital Transfer @ArchivesNZ
Ad

Viewers also liked (6)

PPT
Arts in Healthcare
PPT
Ce functional design workshop
PPTX
SA Paving | Paving Gallery
PPT
Proiect tic a_1c_naca_elena andreea
PDF
Szkolenie UX: nawigacja, formularze, tabele
PDF
Akta 550(1)
Arts in Healthcare
Ce functional design workshop
SA Paving | Paving Gallery
Proiect tic a_1c_naca_elena andreea
Szkolenie UX: nawigacja, formularze, tabele
Akta 550(1)
Ad

Similar to Information-Rich Programming in F# with Semantic Data (20)

PPTX
Programming the Semantic Web
PDF
Staab programming thesemanticweb
PDF
ESWC SS 2013 - Tuesday Keynote Steffen Staab: Programming the Semantic Web
PPTX
Programming the Semantic Web
PPTX
Predicting query performance and explaining results to assist Linked Data con...
PDF
RDFUnit - Test-Driven Linked Data quality Assessment (WWW2014)
PPTX
Semantic Technologies and Programmatic Access to Semantic Data
PPTX
Programming with Semantic Broad Data
PDF
inteSearch: An Intelligent Linked Data Information Access Framework
PDF
8th TUC Meeting - Peter Boncz (CWI). Query Language Task Force status
PDF
Retrieval, Crawling and Fusion of Entity-centric Data on the Web
PPTX
Strategies for Processing and Explaining Distributed Queries on Linked Data
PDF
PhD thesis defense: Large-scale multilingual knowledge extraction, publishin...
PDF
Hide the Stack: Toward Usable Linked Data
PPTX
Knowledge Graph Introduction
PPTX
What;s Coming In SPARQL2?
PPTX
Scala Programming for Semantic Web Developers ESWC Semdev2015
PPT
Re-using Media on the Web: Media fragment re-mixing and playout
PDF
Hala skafkeynote@conferencedata2021
PPTX
Developing and maintaining a Java GraphQL back-end: The less obvious - Bojan ...
Programming the Semantic Web
Staab programming thesemanticweb
ESWC SS 2013 - Tuesday Keynote Steffen Staab: Programming the Semantic Web
Programming the Semantic Web
Predicting query performance and explaining results to assist Linked Data con...
RDFUnit - Test-Driven Linked Data quality Assessment (WWW2014)
Semantic Technologies and Programmatic Access to Semantic Data
Programming with Semantic Broad Data
inteSearch: An Intelligent Linked Data Information Access Framework
8th TUC Meeting - Peter Boncz (CWI). Query Language Task Force status
Retrieval, Crawling and Fusion of Entity-centric Data on the Web
Strategies for Processing and Explaining Distributed Queries on Linked Data
PhD thesis defense: Large-scale multilingual knowledge extraction, publishin...
Hide the Stack: Toward Usable Linked Data
Knowledge Graph Introduction
What;s Coming In SPARQL2?
Scala Programming for Semantic Web Developers ESWC Semdev2015
Re-using Media on the Web: Media fragment re-mixing and playout
Hala skafkeynote@conferencedata2021
Developing and maintaining a Java GraphQL back-end: The less obvious - Bojan ...

More from Steffen Staab (20)

PDF
Towards Scientific Foundation Models (Invited Talk)
PPTX
Investigating Fairness of Decision Making
PPTX
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
PPTX
Knowledge graphs for knowing more and knowing for sure
PPTX
Symbolic Background Knowledge for Machine Learning
PPTX
Soziale Netzwerke und Medien: Multi-disziplinäre Ansätze für ein multi-dimens...
PPTX
Web Futures: Inclusive, Intelligent, Sustainable
PPTX
Eyeing the Web
PPTX
Concepts in Application Context ( How we may think conceptually )
PDF
Storing and Querying Semantic Data in the Cloud
PPTX
Semantics reloaded
PPTX
Ontologien und Semantic Web - Impulsvortrag Terminologietag
PPTX
Opinion Formation and Spreading
PPTX
The Web We Want
PPTX
10 Jahre Web Science
PPTX
(Semi-)Automatic analysis of online contents
PPTX
Text Mining using LDA with Context
PPTX
Wwsss intro2016-final
PPTX
10 Years Web Science
PPTX
Semantic Web Technologies: Principles and Practices
Towards Scientific Foundation Models (Invited Talk)
Investigating Fairness of Decision Making
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Knowledge graphs for knowing more and knowing for sure
Symbolic Background Knowledge for Machine Learning
Soziale Netzwerke und Medien: Multi-disziplinäre Ansätze für ein multi-dimens...
Web Futures: Inclusive, Intelligent, Sustainable
Eyeing the Web
Concepts in Application Context ( How we may think conceptually )
Storing and Querying Semantic Data in the Cloud
Semantics reloaded
Ontologien und Semantic Web - Impulsvortrag Terminologietag
Opinion Formation and Spreading
The Web We Want
10 Jahre Web Science
(Semi-)Automatic analysis of online contents
Text Mining using LDA with Context
Wwsss intro2016-final
10 Years Web Science
Semantic Web Technologies: Principles and Practices

Recently uploaded (20)

PPT
Teaching material agriculture food technology
PDF
Encapsulation theory and applications.pdf
PDF
Mobile App Security Testing_ A Comprehensive Guide.pdf
PDF
Optimiser vos workloads AI/ML sur Amazon EC2 et AWS Graviton
PDF
Chapter 3 Spatial Domain Image Processing.pdf
PPTX
20250228 LYD VKU AI Blended-Learning.pptx
PDF
Dropbox Q2 2025 Financial Results & Investor Presentation
PPTX
sap open course for s4hana steps from ECC to s4
DOCX
The AUB Centre for AI in Media Proposal.docx
PDF
Agricultural_Statistics_at_a_Glance_2022_0.pdf
PDF
cuic standard and advanced reporting.pdf
PDF
Electronic commerce courselecture one. Pdf
PDF
Profit Center Accounting in SAP S/4HANA, S4F28 Col11
PDF
Spectral efficient network and resource selection model in 5G networks
PDF
Advanced methodologies resolving dimensionality complications for autism neur...
PPTX
ACSFv1EN-58255 AWS Academy Cloud Security Foundations.pptx
PPTX
Cloud computing and distributed systems.
PDF
Approach and Philosophy of On baking technology
PPTX
Machine Learning_overview_presentation.pptx
PDF
Review of recent advances in non-invasive hemoglobin estimation
Teaching material agriculture food technology
Encapsulation theory and applications.pdf
Mobile App Security Testing_ A Comprehensive Guide.pdf
Optimiser vos workloads AI/ML sur Amazon EC2 et AWS Graviton
Chapter 3 Spatial Domain Image Processing.pdf
20250228 LYD VKU AI Blended-Learning.pptx
Dropbox Q2 2025 Financial Results & Investor Presentation
sap open course for s4hana steps from ECC to s4
The AUB Centre for AI in Media Proposal.docx
Agricultural_Statistics_at_a_Glance_2022_0.pdf
cuic standard and advanced reporting.pdf
Electronic commerce courselecture one. Pdf
Profit Center Accounting in SAP S/4HANA, S4F28 Col11
Spectral efficient network and resource selection model in 5G networks
Advanced methodologies resolving dimensionality complications for autism neur...
ACSFv1EN-58255 AWS Academy Cloud Security Foundations.pptx
Cloud computing and distributed systems.
Approach and Philosophy of On baking technology
Machine Learning_overview_presentation.pptx
Review of recent advances in non-invasive hemoglobin estimation

Information-Rich Programming in F# with Semantic Data

  • 1. Web Science & Technologies University of Koblenz ▪ Landau, Germany Information-Rich Programming in F# with Semantic Data
  • 2. Linked Open Data Cloud Where’s the Data in the Big Data Wave? Gerhard Weikum SIGMOD Blog, 6.3.2013 http://wp.sigmod.org/ … the Web of Linked Data consisting of more than 30 Billion RDF triples from hundreds of data sources … WeST Steffen Staab [email protected] 2
  • 3. Some „Bubbles“ of the LOD Cloud WeST Steffen Staab [email protected] 3
  • 5. Example RDF Graph Native Graph OR R2RML: RDB to RDF Mapping Language (W3C rec) WeST Steffen Staab [email protected] 5
  • 6. Agenda LiteQ – Language integrated types, extensions and queries for RDF graphs  Exploring  Programming, Typing Evaluation of LITEQ (NPQL) against SPARQL Understandability Ease of use SchemEX Construction of schema-based index Schema induction WeST Steffen Staab [email protected] 6
  • 7. Programming against unknown data source Exploring a data source WeST Steffen Staab [email protected] Using a data source 7
  • 8. Example application • Goal: Application that helps to collect dog license fee • Send Email reminders to dog owners • Data is given as RDF graph WeST Steffen Staab [email protected] 8
  • 9. Programmer‘s Task 1: Schema Exploration Schema exploration & Identification of important RDF types • Find RDF types representing dogs and persons WeST Steffen Staab [email protected] 9
  • 10. Naive Approach Task 1: Schema Exploration Schema exploration & Identification of important RDF types • Find RDF types representing dogs and persons Tooling for Naïve Approach: SPARQL Query Formulation WeST Steffen Staab [email protected] 10
  • 11. Programmer‘s Task 2: Code Type Creation Code type creation in host language • Convert the identified dog and person RDF types to code types in the host language type exCreature(uri) = class member this.hasName : String = … Member this.hasAge : int = … end type exDog(uri) = class inherit exCreature(uri) member this.hasOwner : exPerson = … member this.TaxNo : Integer = … end type exPerson(uri) = class inherit exCreature(uri) end WeST Steffen Staab [email protected] 11
  • 12. Programmer‘s Task 3: Data querying Data querying • Write a query that returns all dog owners WeST Steffen Staab [email protected] 12
  • 13. Naive Approach Task 3: Data querying Data querying • Write a query that returns all dog owners Tooling for Naive Approach: SPARQL Query formulation WeST Steffen Staab [email protected] 13
  • 14. Naive Approach Task 4: Object manipulation Create the objects, manipulate them & make them persistent • Develop functionality around query to send reminder let queryString = “SELECT ?owner WHERE { ?dog rdf:type exDog. ?dog ex:hasOwner ?owner }“ dbConnection.evaluate(queryString) |> Seq.iter ( fun uri -> let p = new Person(uri) sendReminderEmail(p) ) WeST Steffen Staab [email protected] 14
  • 16. Node Path Query Language WeST Steffen Staab [email protected] 16
  • 17. Graph Traversal with NPQL: Subtype Navigation > NPQL rdf:Resource > ex:Creature WeST Steffen Staab [email protected] 17
  • 18. Graph Traversal with NPQL: Property Navigation . NPQL ex:Dog . ex:hasOwner WeST Steffen Staab [email protected] 18
  • 19. Extensional Semantics: Task 3 – Querying for Owners NPQL rdf:Resource > ex:Dog ex:Creature > ex:Dog . ex:hasOwner -> Extension • Select ex:Dog • Walk through ex:hasOwner to ex:Person • Use extension to retrieve all persons who own dogs: ex:Bob WeST Steffen Staab [email protected] 19
  • 20. Intensional Semantics: Task 2 - Creating Person Code Type NPQL rdf:Resource > ex:Creature > ex:Dog.hasOwner -> Intension • Select ex:Person node • “Intension” to get code type based on rdf type type exCreature(uri) = class member this.hasName : String = … Member this.hasAge : int = … end type exPerson(uri) = class inherit exCreature(uri) WeST Steffen Staab end [email protected] 20
  • 21. Autocompletion Semantics: Task 1 - Exploration NPQL rdf:Resource > ex:Creature > Suggestions during query writing • Instances based on extensional semantics • Types & Props based on intensional semantics ex:Person, ex:Dog WeST Steffen Staab [email protected] 21
  • 22. Extensional Semantics: LA Conjunctive Queries NPQL ex:Dog <- ex:hasOwner Left associative conjunctive query with projection WeST Steffen Staab [email protected] 22
  • 23. Host Language Extension: Task 4 – Create Objects Create the objects, manipulation & persistence • Develop the functionality around the query that will send the reminder using LITEQ in F# Preliminary Implementation in F# http://west.uni-koblenz.de/Research/systems/liteq WeST Steffen Staab [email protected] 23
  • 24. Web Science & Technologies University of Koblenz ▪ Landau, Germany Live demo of LITEQ in Visual Studio/F#
  • 25. Related Work Task LINQ XML Freebase Type Type Provider Provider LITEQ current version LITEQ Concept 1 Schema exploration - (✔) per doc (✔) only trees ✔ ✔ 2 Code type creation - (✔) erased types? (✔) erased types (✔) erased types ✔ full hierarchy ✔ - ((✔)) very limited expressiv. (✔) limited expressiv. ✔ no full SPARQL (✔) ✔ - ✔ no new object creation ✔ 3 Data querying 4 Object manipulation & persistence WeST Steffen Staab [email protected] 26
  • 26. Future work wrt LITEQ • Current implementation is a prototype • Current implementation uses erased types  At runtime, no type hierarchy is present • Switch to generated types in the future  Higher expressiveness in the host language exploiting type hierarchy • Optimizations of LITEQ implementation necessary • Lazy evaluation • Distinguish between design time and runtime • Not all types created at design time are needed at runtime • Formalize query language and investigate expressiveness WeST Steffen Staab [email protected] 27
  • 27. Challenge: Joint Type Inference Data modeling world Description Logics Program modeling world ML type inference RDF UML class diagrams WeST Steffen Staab [email protected] 28
  • 28. Agenda LiteQ – Language integrated types, extensions and queries for RDF graphs  Exploring  Programming, Typing Evaluation of LITEQ (NPQL) vs. SPARQL Understandability Ease of use SchemEX Where do I find relevant data? Efficient construction of a schema-level index WeST Steffen Staab [email protected] 29
  • 29. Preliminary Evaluation of LITEQ/NPQL Focused on NPQL • Reason: Test subjects lacked knowledge of F# and functional programming for evaluating LITEQ in full • Comparing NPQL against SPARQL Main Hypothesis of Evaluation • NPQL with autocompletion allows for effective query writing in more efficient manner than SPARQL Thus: some of the advantages of LITEQ cannot show up in the evaluation! WeST Steffen Staab [email protected] 30
  • 30. Evaluation Subjects Evaluation with 11 participants • 1 subject a posteriori eliminated from analysis of evaluation, because he could not deal with SPARQL at all! • 10 subjects remaining for analysis Participants • Undergraduate students • PhD students • PostDocs WeST Steffen Staab [email protected] 31
  • 31. Evaluation - Setup 1. Pre-questionaire 1. Training in RDF, SPARQL & NPQL 1. Experimental tasks to be solved by subjects 1. Post-questionaire WeST Steffen Staab [email protected] 32
  • 32. Phase 1: Pre-Questionnaire – Knowledge & skills • Programming: All • Object-orientation: 8 • Functional programming:  “Intermediate” or above  “Intermediate” or above 4 Intermediate” or above Lisp, Haskell, F# (once) 4 none” • .NET 1 Expert” 2 Beginner” 7 none” • SPARQL: 3 Intermediate” or above 7 below “intermediate” WeST Steffen Staab [email protected] [Sparql Experts] [Sparql Novices] 33
  • 33. Phase 2: Training in RDF, SPARQL, NPQL Training in RDF & SPARQL • Presentation of RDF & SPARQL (20 minutes) • Practical excercise writing SPARQL queries in the Web interface (5 minutes) Training in NPQL • Practical excercise writing NPQL queries in Visual Studio (5 minutes) WeST Steffen Staab [email protected] 34
  • 34. Phase 3: Solving experimental tasks by subjects 9 different experimental tasks to solve • Half of tasks in NPQL using Visual Studio • Other half using SPARQL and a web interface Task types • Navigation and exploration of a data source (Task 1) • Retrieving and answering questions about the data (Task 3) • 2 tasks were not solvable in NPQL • Investigating how users deal with limits of the language Evaluation measure: • Duration to complete each task WeST Steffen Staab [email protected] 35
  • 35. Evaluation across different user types
  • 37. Phase 4: Post-Questionnaire “Do you want to explore a data source in your IDE?” 4 “yes” 3 “no, prefer separation of steps” 3 “no preference” “NPQL is easier to use than SPARQL” 7 “agree” or above Other • Better support when writing queries in SPARQL • Better response times for interactive working with NPQL My conclusion Though LITEQ is still in a pre-alpha status, advantages became visible in preliminary user evaluation
  • 38. Agenda LiteQ – Language integrated types, extensions and queries for RDF graphs • Exploring • Programming, Typing Evaluation of LITEQ (NPQL) against SPARQL Understandability Ease of use SchemEX Construction of schema-based index Schema induction
  • 39. Searching the LOD cloud SELECT ?x WHERE { ?x rdf:type foaf:Document . ?x rdf:type swrc:InProceedings . ?x dc:creator ?y . ?y rdf:type fb:Computer_Scientist } [Figure: graph pattern of the query with labels foaf:Document, swrc:InProceedings, dc:creator, fb:Computer_Scientist]
  • 40. Searching the LOD cloud SELECT ?x WHERE { ?x rdf:type foaf:Document . ?x rdf:type swrc:InProceedings . ?x dc:creator ?y . ?y rdf:type fb:Computer_Scientist } [Figure: index pointing to the data sources ACM and DBLP]
  • 41. Schema-level index Schema information on LOD • Explicit: assigning class types (Entity rdf:type Class) • Implicit: modelling attributes (Entity Property Entity 2)
  • 43. Typecluster • Entities with the same set of types [Figure: type cluster TCj over classes C1, C2, ..., Cn, pointing to data sources DS1, DS2, ..., DSm]
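A type cluster can be sketched in a few lines. The following is a minimal illustration, not the SchemEX implementation: the quads, the `ex:` identifiers and the function name `type_clusters` are made up for this example. It groups entities by their exact set of `rdf:type` values and records which data sources contain such entities.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"

# Hypothetical toy data: (subject, predicate, object, data source) quads.
QUADS = [
    ("ex:paper1", RDF_TYPE, "foaf:Document", "DBLP"),
    ("ex:paper1", RDF_TYPE, "swrc:InProceedings", "DBLP"),
    ("ex:paper2", RDF_TYPE, "foaf:Document", "ACM"),
    ("ex:paper2", RDF_TYPE, "swrc:InProceedings", "ACM"),
    ("ex:alice", RDF_TYPE, "fb:Computer_Scientist", "DBLP"),
]


def type_clusters(quads):
    """Map each distinct set of types to the data sources that hold
    entities carrying exactly that set of types."""
    types_of = defaultdict(set)    # entity -> its rdf:type values
    sources_of = defaultdict(set)  # entity -> data sources of its type statements
    for subject, predicate, obj, source in quads:
        if predicate == RDF_TYPE:
            types_of[subject].add(obj)
            sources_of[subject].add(source)
    clusters = defaultdict(set)
    for entity, types in types_of.items():
        clusters[frozenset(types)] |= sources_of[entity]
    return dict(clusters)


CLUSTERS = type_clusters(QUADS)
```

Here both papers fall into one cluster keyed by `{foaf:Document, swrc:InProceedings}`, so a query asking for that type combination can be routed to both DBLP and ACM without touching the instance data.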
  • 45. Bi-Simulation • Entities are equivalent if they refer with the same attributes to equivalent entities • Restriction: 1-Bi-Simulation [Figure: bisimulation class BSi over properties P1, P2, ..., Pn, pointing to data sources DS1, DS2, ..., DSm]
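One way to read the restriction to 1-bisimulation is that only one hop is inspected, so two entities are equivalent when they have the same set of outgoing properties. The sketch below follows that reading; it is an assumption made for illustration, and the triples and the name `one_bisimulation` are hypothetical.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"

# Hypothetical toy data: (subject, predicate, object) triples.
TRIPLES = [
    ("ex:paper1", "dc:creator", "ex:alice"),
    ("ex:paper1", "dc:title", '"LITEQ"'),
    ("ex:paper2", "dc:creator", "ex:bob"),
    ("ex:paper2", "dc:title", '"SchemEX"'),
    ("ex:alice", "foaf:name", '"Alice"'),
]


def one_bisimulation(triples):
    """Partition entities by their set of outgoing properties.

    With depth 1 all target entities count as equivalent, so only the
    property names matter (type statements are handled by type clusters).
    """
    properties_of = defaultdict(set)
    for subject, predicate, _obj in triples:
        if predicate != RDF_TYPE:
            properties_of[subject].add(predicate)
    blocks = defaultdict(set)
    for entity, properties in properties_of.items():
        blocks[frozenset(properties)].add(entity)
    return dict(blocks)


BLOCKS = one_bisimulation(TRIPLES)
```

Both papers land in the block keyed by `{dc:creator, dc:title}`, while `ex:alice` forms her own block.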
  • 47. SchemEX: Combination TC and Bi-Simulation • Partition of TC based on 1-Bi-Simulation with restrictions on the destination TC [Figure: schema layer (classes C1, C2, ..., Cn, C45; type clusters TCj, TCk; bisimulation BSi over properties P1, P2, ..., Pn; equivalence classes EQC, EQCj) and payload layer (data sources DS1, DS2, ..., DSm)]
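The combination can be sketched as follows, under stated assumptions: an equivalence class is keyed by the entity's type set plus the set of (property, type set of the target) pairs, and its payload is the set of data sources. The data, the names `build_index` and `lookup`, and the superset matching in `lookup` are simplifications chosen for this example, not the published algorithm.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"

# Hypothetical toy data: (subject, predicate, object, data source) quads.
QUADS = [
    ("ex:paper1", RDF_TYPE, "foaf:Document", "DBLP"),
    ("ex:paper1", RDF_TYPE, "swrc:InProceedings", "DBLP"),
    ("ex:paper1", "dc:creator", "ex:alice", "DBLP"),
    ("ex:alice", RDF_TYPE, "fb:Computer_Scientist", "DBLP"),
    ("ex:report1", RDF_TYPE, "foaf:Document", "ACM"),
]


def build_index(quads):
    """Schema key -> data sources (payload).

    Key = (type cluster of the entity,
           set of (property, type cluster of the target entity))."""
    types_of = defaultdict(set)
    for subject, predicate, obj, _source in quads:
        if predicate == RDF_TYPE:
            types_of[subject].add(obj)
    links = defaultdict(set)
    sources = defaultdict(set)
    for subject, predicate, obj, source in quads:
        sources[subject].add(source)
        if predicate != RDF_TYPE:
            links[subject].add((predicate, frozenset(types_of.get(obj, ()))))
    index = defaultdict(set)
    for entity, entity_sources in sources.items():
        key = (frozenset(types_of.get(entity, ())),
               frozenset(links.get(entity, ())))
        index[key] |= entity_sources
    return dict(index)


def lookup(index, types, prop, target_types):
    """Data sources whose schema matches: entity has `types` and a `prop`
    edge to something carrying `target_types` (superset match)."""
    hits = set()
    for (type_cluster, edges), index_sources in index.items():
        has_edge = any(p == prop and set(target_types) <= t for p, t in edges)
        if set(types) <= type_cluster and has_edge:
            hits |= index_sources
    return hits


INDEX = build_index(QUADS)
RESULT = lookup(INDEX, {"foaf:Document", "swrc:InProceedings"},
                "dc:creator", {"fb:Computer_Scientist"})
```

For the toy data, `RESULT` contains only `DBLP`: the ACM report is a `foaf:Document` but has neither the second type nor a `dc:creator` edge to a computer scientist.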
  • 48. SchemEX: Example SELECT ?x WHERE { ?x rdf:type foaf:Document . ?x rdf:type swrc:InProceedings . ?x dc:creator ?y . ?y rdf:type fb:Computer_Scient } [Figure: index entries for the query with labels foaf:Document, swrc:InProceedings, fb:Computer_Scientist, dc:creator, tc2309, tc2101, bs260 8, eqc707 and data source DBLP]
  • 49. SchemEX: Computation • Precise computation: Brute-Force [Figure: schema layer (classes C1, C2, ..., Cn, C12; type clusters TCj, TCk; bisimulation BSi over properties P1, P2, ..., Pn; equivalence classes EQC, EQCj) and payload layer (data sources DS1, DS2, ..., DSm)]
  • 50. Stream-based Computation of SchemEX • LOD Crawler: Stream of n-Quads (triple + data source) … Q16, Q15, Q14, Q13, Q12, Q11, Q10, Q9, Q8, Q7, Q6, Q5, Q4, Q3, Q2, Q1 [Figure: quads entering a FiFo cache of numbered entities that are linked to classes C1, C2, C3]
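The stream-based idea can be illustrated with a bounded first-in-first-out cache. This is a sketch under assumptions, restricted to type clusters for brevity: entities are held in the cache while their quads arrive, and when the cache is full the oldest entity is evicted and its schema information is written to the index. The stream, the cache sizes and the name `stream_type_clusters` are hypothetical.

```python
from collections import OrderedDict, defaultdict

RDF_TYPE = "rdf:type"

# Hypothetical quad stream; note that ex:a reappears after two other entities.
STREAM = [
    ("ex:a", RDF_TYPE, "C1", "DS1"),
    ("ex:b", RDF_TYPE, "C2", "DS1"),
    ("ex:c", RDF_TYPE, "C3", "DS2"),
    ("ex:a", RDF_TYPE, "C2", "DS1"),
]


def stream_type_clusters(stream, cache_size):
    """Build a type-cluster index in one pass with a FIFO entity cache."""
    cache = OrderedDict()      # entity -> (types, sources), oldest first
    index = defaultdict(set)   # type set -> data sources

    def flush(types, sources):
        index[frozenset(types)] |= sources

    for subject, predicate, obj, source in stream:
        if predicate != RDF_TYPE:
            continue
        if subject not in cache:
            if len(cache) >= cache_size:
                # Evict the oldest entity and emit what is known about it.
                _, (old_types, old_sources) = cache.popitem(last=False)
                flush(old_types, old_sources)
            cache[subject] = (set(), set())
        types, sources = cache[subject]
        types.add(obj)
        sources.add(source)
    for types, sources in cache.values():
        flush(types, sources)
    return dict(index)


EXACT = stream_type_clusters(STREAM, cache_size=10)
APPROX = stream_type_clusters(STREAM, cache_size=2)
```

With a cache large enough to hold every entity the result matches the brute-force grouping. With `cache_size=2`, `ex:a` is evicted before its second type arrives, so the index records `{C1}` and `{C2}` instead of `{C1, C2}`. This is the kind of approximation error that the comparison against brute force measures.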
  • 51. Quality of Approximated Index Stream-based computation vs. brute force Data set of 11 million triples
  • 52. SchemEX @ BTC 2011 SchemEX Allows complex queries (Star, Chain) Scalable computation High quality Index over BTC 2011 data 2.17 billion triples Index: 55 million triples Commodity hardware VM: 1 Core, 4 GB RAM Throughput: 39,500 triples / second Computation of full index: 15h
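The reported figures are consistent with each other, as a quick back-of-the-envelope check shows. The variable names are made up for this example; the numbers are the ones on the slide.

```python
# Figures taken from the slide.
INPUT_TRIPLES = 2.17e9        # BTC 2011 data
INDEX_TRIPLES = 55e6          # size of the resulting index
THROUGHPUT_PER_SECOND = 39_500

seconds = INPUT_TRIPLES / THROUGHPUT_PER_SECOND
hours = seconds / 3600                       # roughly 15 hours
index_ratio = INDEX_TRIPLES / INPUT_TRIPLES  # index is about 2.5% of the input

print(f"{hours:.1f} h, index/input = {index_ratio:.1%}")
```

At the stated throughput, a single pass over the data takes a little over 15 hours, which matches the reported computation time, and the index is roughly one fortieth the size of the input.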
  • 53. Future work wrt SchemEX Further exploration of • schema induction • query federation Federation vs Link Traversal based query execution • Granularity of query execution • Too fine grained: URI dereferencing • Too expressive: SPARQL • Sweet spot -> NPQL??
  • 54. Agenda LiteQ – Language integrated types, extensions and queries for RDF graphs • Exploring • Programming, Typing Evaluation of LITEQ (NPQL) against SPARQL Understandability Ease of use SchemEX Construction of schema-based index Schema induction
  • 55. Future 1. Searching for distributed data 2. Understanding distributed data 3. Intelligent queries on distributed data 4. Programming with distributed data • Type reuse • Type induction
  • 56. Web Science & Technologies University of Koblenz ▪ Landau, Germany Thank you for your attention!