eswc



ESWC presentation about RDF ORM

On GitHub: MakoLab / eswc

Consuming RDF data

OOP-way

makolab.github.io/eswc

Karol Szczepański, Makolab S.A.

Tomasz Pluskiewicz, PGS Software S.A.

The disease

Working with RDF data is not an easy task.

Either you comply with the graph representation and give up all the candy your language offers (e.g. operators, inheritance, etc. in C#)...

... or you've got to switch to the language that has better support.

An RDF graph is not easy to work with either, and you have to know a query language such as SPARQL to get anything out of it. That is yet another two acronyms to learn.

The cure

There is still hope!

There are ways that can ease the pain or even remove it!

Consider the example of portal.ChemicalSemantics.com. This portal is a platform where computational chemistry scientists can publish their results as RDF data sets built with application-specific plugins.

There are several complex types, such as molecular systems, molecules and residues, with various, sometimes cyclic, relations.

Romantic Web

A quick look at the ontology resulted in a few data contracts that map all the necessary types and relations onto interfaces and properties.

With either:

  • attributes - clean and readable
  • fluent-like API mapping - gives more control and improves maintainability

the whole process lets ordinary C# code read, write and query RDF data!
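To make the attribute option concrete, here is a minimal sketch of how attribute-based mappings can be read back through reflection. It is an illustration only: `RdfClassAttribute`, `RdfPropertyAttribute`, `IAtomSketch` and `MappingReader` are hypothetical names invented for this example and are not Romantic Web's own types. The namespace IRIs are the ones that appear in the generated SPARQL later in the deck.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical stand-ins for mapping attributes; these are NOT Romantic Web's own types.
[AttributeUsage(AttributeTargets.Interface | AttributeTargets.Class)]
public sealed class RdfClassAttribute : Attribute
{
    public RdfClassAttribute(string prefix, string term) { Prefix = prefix; Term = term; }
    public string Prefix { get; }
    public string Term { get; }
}

[AttributeUsage(AttributeTargets.Property)]
public sealed class RdfPropertyAttribute : Attribute
{
    public RdfPropertyAttribute(string prefix, string term) { Prefix = prefix; Term = term; }
    public string Prefix { get; }
    public string Term { get; }
}

// A trimmed-down data contract in the spirit of the deck's interfaces.
[RdfClass("gc", "Atom")]
public interface IAtomSketch
{
    [RdfProperty("gc", "isElement")]
    string Element { get; set; }

    [RdfProperty("rdfs", "label")]
    string Label { get; set; }
}

public static class MappingReader
{
    // Prefix table; the namespace IRIs are the ones visible in the generated SPARQL.
    private static readonly Dictionary<string, string> Prefixes = new Dictionary<string, string>
    {
        ["gc"] = "http://purl.org/gc/",
        ["rdfs"] = "http://www.w3.org/2000/01/rdf-schema#",
    };

    // Resolves the RDF class IRI declared on a contract type.
    public static string ClassIri(Type contract)
    {
        var attribute = contract.GetCustomAttribute<RdfClassAttribute>()
            ?? throw new InvalidOperationException(contract.Name + " has no class mapping.");
        return Prefixes[attribute.Prefix] + attribute.Term;
    }

    // Maps each annotated C# property name to its predicate IRI.
    public static Dictionary<string, string> PredicateIris(Type contract)
    {
        var result = new Dictionary<string, string>();
        foreach (PropertyInfo property in contract.GetProperties())
        {
            var attribute = property.GetCustomAttribute<RdfPropertyAttribute>();
            if (attribute != null)
            {
                result[property.Name] = Prefixes[attribute.Prefix] + attribute.Term;
            }
        }
        return result;
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(MappingReader.ClassIri(typeof(IAtomSketch)));
        foreach (var pair in MappingReader.PredicateIris(typeof(IAtomSketch)))
        {
            Console.WriteLine(pair.Key + " -> " + pair.Value);
        }
    }
}
```

A fluent mapping would carry the same information (contract type, prefix, term) in code rather than in attributes, which is why it tends to be easier to keep separate from the contracts themselves.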

</pub/smith-20150320174958/molSys/> a gc:MolecularSystem ;
    rdfs:label "Molecular System" ;
    dc:isPartOf </pub/smith-20150320174958/> ;
    gc:hasCalculationOn </pub/smith-20150320174958/mol-calc/> ;
    gc:hasSystemCharge [ a gc:FloatValue ;
        rdfs:label "System Charge" ;
        gc:hasFloatValue "0" ;
        gc:hasUnit gc:atomicUnit ] ;
    gc:hasSystemMultiplicity [ a gc:IntegerValue ;
        rdfs:label "System Multiplicity" ;
        gc:hasIntegerValue "1" ] ;
    gc:hasSystemTemperature [ a gc:FloatValue ;
        rdfs:label "Temperature" ;
        gc:hasFloatValue "0" ;
        gc:hasUnit unit:kelvin ] ;
    gc:holds </pub/smith-20150320174958/molSys/m1/> ;
    rdfs:comment "This molecular system has been generated by HyperChem and converted by cml2rdf." .
</pub/smith-20150320174958/> a gc:ComputationalChemistryPublication ;
    rdfs:label "Publication" ;
    dc:creator [ a dc:Agent ;
        schema:worksFor "organxyz" ;
        foaf:mbox "auth@hyper.com" ;
        foaf:name "authorxyz" ] ;
    dc:title "water pm3 vib" ;
    gc:hasInchiiKey [ a gc:stringValue ;
        rdfs:label "INCHII Key" ;
        gc:stringValue "n/a" ] ;
    gc:hasInput [ a gc:stringValue ;
        rdfs:label "Input" ;
        gc:stringValue "NA" ] ;
    gc:hasOrganization [ a gc:stringValue ;
        rdfs:label "Organization" ;
        gc:stringValue "organxyz" ] ;
    gc:hasOutput [ a gc:stringValue ;
        rdfs:label "Output" ;
        gc:stringValue "NA" ] ;
    gc:hasPublisher </pub/usr/smith/> ;
    gc:hasSource "HyperChem" ;
    gc:hasTag "pm3", "semi" ;
    gc:hasVisibility [ a gc:stringValue ;
        rdfs:label "Visibility" ;
        gc:stringValue "n/a" ] ;
    rdfs:comment "This is publication of computational chemistry result." .

A few C# data contracts

[Class("gc", "MolecularSystem")]
public interface IMolecularSystem : IEntityWithLabel
{
    [Property("dcterms", "isPartOf")]
    IComputationalChemistryPublication Publication { get; set; }

    [Property("gc", "hasSystemCharge")]
    IFloatValue SystemCharge { get; set; }

    [Property("gc", "hasSystemMultiplicity")]
    IIntegerValue Multiplicity { get; set; }

    [Collection("gc", "holds")]
    IEnumerable<IMolecule> Holds { get; }
}
[Class("gc", "Molecule")]
public interface IMolecule : IEntityWithLabel
{
    [Collection("gc", "hasAtom")]
    ICollection<IAtom> Atoms { get; }

    [Collection("gc", "hasResidue")]
    ICollection<IResidue> Residues { get; }
}
[Class("gc", "Atom")]
public interface IAtom : IEntityWithLabel, IEntityWithName
{
    [Property("gc", "isElement")]
    string Element { get; set; }

    [Property("gc", "hasCoordinates")]
    IVectorValue Coordinates { get; set; }

    [Collection("gc", "hasBond")]
    IEnumerable<IBond> Bonds { get; }
}

LINQ query for atoms whose label contains the letter 'o'

var atoms = from atom in entityContext.AsQueryable<IAtom>()
            where Regex.IsMatch(atom.Label, "o")
            select atom;
foreach (var atom in atoms)
{
    Console.WriteLine(atom.Label);
}

... and its corresponding SPARQL query

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?s ?p ?o ?Gatom0 ?atom0
WHERE
{
    GRAPH ?Gatom0
    {
        ?atom0 a <http://purl.org/gc/Atom> .
        ?atom0 <http://www.w3.org/2000/01/rdf-schema#label> ?label0 .
        FILTER(REGEX(?label0,"o"))  .
        ?s ?p ?o .
    }
    GRAPH <meta://graph/>
    {
        ?Gatom0 <http://xmlns.com/foaf/0.1/primaryTopic> ?atom0 .
    }
}

LINQ query for functional groups of type O-H

var groups = from molecule in entityContext.AsQueryable<IMolecule>()
    from atom in molecule.Atoms
    where atom.Element == "O"
    from bond in atom.Bonds
    from another in entityContext.AsQueryable<IMolecule>()
    from anotherAtom in another.Atoms
    where anotherAtom.Element == "H"
    from anotherBond in anotherAtom.Bonds
    where bond.Id == anotherBond.Id
    select molecule;
foreach (var group in groups)
{
    Console.WriteLine(group.Label);
}

... and its corresponding SPARQL query

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?s ?p ?o ?Gmolecule0 ?molecule0
WHERE
{
    GRAPH ?Gmolecule0 {
        ?molecule0 a <http://purl.org/gc/Molecule> .
        ?molecule0 <http://purl.org/gc/hasAtom> ?atoms0 .
        ?s ?p ?o .
    }
    GRAPH <meta://graph/> { ?Gmolecule0 <http://xmlns.com/foaf/0.1/primaryTopic> ?molecule0 . }
    GRAPH ?Gatoms0 {
        ?atoms0 a <http://purl.org/gc/Atom> .
        ?atoms0 <http://purl.org/gc/isElement> ?element0 .
        FILTER(?element0 = "O"^^xsd:string)  .
        ?atoms0 <http://purl.org/gc/hasBond> ?bonds0 .
    }
    GRAPH ?Gbonds0 {
        ?bonds0 a <http://purl.org/gc/NormalBond> .
        FILTER(?bonds0 = ?bonds1)
    }
    GRAPH ?Ganother1 {
        ?another1 a <http://purl.org/gc/Molecule> .
        ?another1 <http://purl.org/gc/hasAtom> ?atoms1 .
    }
    GRAPH ?Gatoms1 {
        ?atoms1 a <http://purl.org/gc/Atom> .
        ?atoms1 <http://purl.org/gc/isElement> ?element1 .
        FILTER(?element1 = "H"^^xsd:string)  .
        ?atoms1 <http://purl.org/gc/hasBond> ?bonds1 .
    }
    GRAPH ?Gbonds1 {
        ?bonds1 a <http://purl.org/gc/NormalBond> .
        FILTER(?bonds0 = ?bonds1)
    }
}
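How a `where` clause can end up as a triple pattern plus FILTER can be sketched with a plain expression tree. This is a deliberately tiny illustration, not Romantic Web's query provider: `FilterSketch`, `IAtomSketch` and the variable naming scheme are assumptions made for this example, and only predicates of the exact shape `x => x.Property == "literal"` are handled.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical contract used only for this sketch.
public interface IAtomSketch
{
    string Element { get; }
}

public static class FilterSketch
{
    // Property-to-predicate table; the IRI is the one used in the deck's generated query.
    private static readonly Dictionary<string, string> Predicates = new Dictionary<string, string>
    {
        ["Element"] = "http://purl.org/gc/isElement",
    };

    // Translates a predicate of the exact shape  x => x.Property == "literal"
    // into a triple pattern plus FILTER. Anything else is rejected.
    public static string Translate<T>(Expression<Func<T, bool>> predicate, string subjectVariable)
    {
        var comparison = predicate.Body as BinaryExpression;
        if (comparison == null || comparison.NodeType != ExpressionType.Equal)
        {
            throw new NotSupportedException("Only == comparisons are handled.");
        }

        var member = comparison.Left as MemberExpression;
        var constant = comparison.Right as ConstantExpression;
        var literal = constant == null ? null : constant.Value as string;
        if (member == null || literal == null)
        {
            throw new NotSupportedException("Expected property == string literal.");
        }

        string iri = Predicates[member.Member.Name];
        string variable = member.Member.Name.ToLowerInvariant() + "0";
        // Escape backslashes and quotes so the literal stays a valid SPARQL string.
        string escaped = literal.Replace("\\", "\\\\").Replace("\"", "\\\"");

        return "?" + subjectVariable + " <" + iri + "> ?" + variable + " .\n"
             + "FILTER(?" + variable + " = \"" + escaped + "\"^^xsd:string) .";
    }
}

public static class Program
{
    public static void Main()
    {
        // Prints a triple pattern for gc:isElement followed by a FILTER on "O".
        Console.WriteLine(FilterSketch.Translate<IAtomSketch>(atom => atom.Element == "O", "atom0"));
    }
}
```

A real provider also has to handle joins, nested `from` clauses and named graphs, which is where most of the generated query's size comes from.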

What are the costs and drawbacks?

  • requires a meta-graph that records which graph contains which non-blank resource
  • the infrastructure behind the scenes is quite heavy
  • SPARQL queries are auto-generated, so they may not be optimal
  • issues with multiple references to the same blank node across multiple graphs

JSON-LD Entities

Why JsonLd.Entities?

Romantic Web shortcomings:

  • enforces certain graph structure
  • practical only with triple store or SPARQL endpoint
  • fairly complex implementation

Why JsonLd.Entities?

Static language habits:

  • .NET devs want POCO objects
    • not dynamic
    • not low-level triple handling

JSON-LD Entities

Convention-based RDF to C# mapping

Simple conventions
  • Match property names (@id => Id)
  • Serialization requires @context

JSON-LD Entities

Convention-based RDF to C# mapping

Handle JSON-LD constructs
  • @value
  • @set and @list
@frame to deserialize graphs of objects

JSON-LD Entities

Convention-based RDF to C# mapping

{
  "@context": {
     "foaf": "http://xmlns.com/foaf/0.1/",
     "name": "foaf:name",
     "lastName": "foaf:familyName",
     "Person": "foaf:Person",
     "interests": "http://dbpedia.org/property/interests"
  },
  "@id": "http://t-code.pl/#tomasz",
  "@type": "Person",
  "name": "Tomasz",
  "lastName": { "@value": "Pluskiewicz" },
  "interests": [ "RDF", ".NET" ]
}

(Other formats are transformed to JSON-LD first)

public class Person
{
    public Uri Id { get; set; }

    public string Name { get; set; }

    public string LastName { get; set; }

    public string[] Interests { get; set; }
}
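The convention above (match JSON keys to property names, unwrap `@value`) can be sketched with nothing but `System.Text.Json`. This is not how JsonLd.Entities is implemented and it performs no real JSON-LD processing: the `@context` is ignored, and the input is assumed to be already compacted so that its terms line up with the property names. `PersonSketch`, `ValueObjectStringConverter` and `PersonReader` are hypothetical names for this example.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Unwraps a JSON-LD value object {"@value": "x"} into "x"; plain strings pass through.
public sealed class ValueObjectStringConverter : JsonConverter<string>
{
    public override string Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
    {
        if (reader.TokenType == JsonTokenType.String)
        {
            return reader.GetString();
        }

        using (JsonDocument document = JsonDocument.ParseValue(ref reader))
        {
            return document.RootElement.GetProperty("@value").GetString();
        }
    }

    public override void Write(Utf8JsonWriter writer, string value, JsonSerializerOptions options)
    {
        writer.WriteStringValue(value);
    }
}

// Same shape as the Person class on the slide; renamed to mark it as a sketch.
public class PersonSketch
{
    [JsonPropertyName("@id")]
    public Uri Id { get; set; }

    public string Name { get; set; }

    public string LastName { get; set; }

    public string[] Interests { get; set; }
}

public static class PersonReader
{
    // Case-insensitive matching stands in for the "name => Name" convention.
    public static PersonSketch Parse(string json)
    {
        var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };
        options.Converters.Add(new ValueObjectStringConverter());
        return JsonSerializer.Deserialize<PersonSketch>(json, options);
    }
}

public static class Program
{
    public static void Main()
    {
        const string json = @"{
  ""@context"": { ""foaf"": ""http://xmlns.com/foaf/0.1/"", ""name"": ""foaf:name"", ""lastName"": ""foaf:familyName"" },
  ""@id"": ""http://t-code.pl/#tomasz"",
  ""@type"": ""Person"",
  ""name"": ""Tomasz"",
  ""lastName"": { ""@value"": ""Pluskiewicz"" },
  ""interests"": [ ""RDF"", "".NET"" ]
}";
        PersonSketch person = PersonReader.Parse(json);
        Console.WriteLine(person.Name + " " + person.LastName);
    }
}
```

Because the context is never consulted, this only works when the document has been compacted (or framed) against a context whose terms match the class. That is the step a JSON-LD processor would take care of.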

JsonLd.Entities

  • JSON-LD bridges the gap between code and RDF
  • Simple implementation
    • Can be replicated in any language
  • Use instead of R2RML
  • No query capability
  • Requires existing graphs

Thank you

More at:

JsonLd.Entities

Romantic Web