RDFSemanticsToMoose

From Semantic Web with Perl Wiki

Problem Statement

Objects are language-specific constructs that encapsulate attributes and methods. They are self-contained and hierarchical, with circular references being the exception rather than the rule.

RDF, on the other hand, is a graph structure with no clear boundaries between entities.

Introspection vs. Code Generation

There are two fundamental ways to map between RDF and objects. One is to annotate the class so that attributes map to RDF predicates and their values to RDF objects; a mapper can then introspect the class definitions to generate RDF from existing objects, or objects from existing RDF. The other approach is to introspect the schematic part of an ontology and generate the code for Perl classes from its definitions. MooseX::Semantic does the former, OWL::Class the latter. Both are valid approaches, each with advantages and disadvantages.
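The introspection approach can be illustrated with a small, language-neutral sketch. It is written in Python with only the standard library and is not the MooseX::Semantic API: the `Person` class, the `predicate` metadata key and the `to_triples` helper are all hypothetical names chosen for illustration. The idea is only that each attribute carries its predicate URI as an annotation and a generic mapper walks those annotations.

```python
from dataclasses import dataclass, field, fields

FOAF = "http://xmlns.com/foaf/0.1/"


@dataclass
class Person:
    # The subject URI is plain data; it is not exported as a statement itself.
    uri: str
    # Each annotated attribute names the RDF predicate it maps to.
    name: str = field(metadata={"predicate": FOAF + "name"})
    mbox: str = field(metadata={"predicate": FOAF + "mbox"})


def to_triples(obj):
    """Introspect the annotated fields and emit (subject, predicate, object) tuples."""
    triples = []
    for f in fields(obj):
        predicate = f.metadata.get("predicate")
        if predicate is None:
            continue  # unannotated fields (here: uri) are skipped
        triples.append((obj.uri, predicate, getattr(obj, f.name)))
    return triples


alice = Person(
    uri="http://example.org/alice",
    name="Alice",
    mbox="mailto:alice@example.org",
)
triples = to_triples(alice)
```

The class definition stays the single place where attributes, methods and the RDF mapping live, which is the coupling the introspection approach aims for.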

Pro/Con Introspection

  • Closer coupling of business logic and data model: The developer can think in OO terms and keep attribute definitions and methods in the same place, adding support for RDF round-tripping with just a few adjustments.
  • Data modelling has to be done in two places for larger ontologies: once with tools like Protege, modelling the domain, and again by the programmer who has to adapt her code to it. The two incarnations of the schema have to be kept in sync.
  • It is easy to adapt legacy code to be RDF-aware (as far as it is easy for legacy code to be moved to Moose...)

Pro/Con Code Generation

  • Data Modelling and Coding are kept separate, and data modelling doesn't require coding skills. The data model used in the application can be kept up-to-date by re-generating the code.
  • There are some hoops to jump through to keep attributes and methods of a class together, without overwriting the latter on re-generation of code and so on.
  • There is no inherent need to use Moose, as there is no introspection into Perl code necessary, which might improve performance.

Convergence Points

It is possible to combine the two approaches within MooseX::Semantic by using the MooseX::Semantic::Util::SchemaImport role. This is a parametrized role that gets some specifics about the ontology source and class in question and generates the Moose attributes on-the-fly at runtime.
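As a rough illustration of generating attributes at runtime from schema information, here is a minimal Python sketch. It does not reproduce how MooseX::Semantic::Util::SchemaImport works internally; the `SCHEMA` dictionary stands in for an already-parsed ontology, and `import_class` and `PREDICATES` are hypothetical names.

```python
# Hypothetical, already-parsed schema: class name -> {attribute name: predicate URI}
SCHEMA = {
    "Person": {
        "name": "http://xmlns.com/foaf/0.1/name",
        "homepage": "http://xmlns.com/foaf/0.1/homepage",
    }
}


def import_class(class_name, schema=SCHEMA):
    """Build a class at runtime whose attributes come from the schema."""
    predicates = dict(schema[class_name])

    def init(self, **values):
        unknown = set(values) - set(predicates)
        if unknown:
            raise TypeError(f"unknown attributes: {sorted(unknown)}")
        # Every schema attribute exists on the instance; unset ones are None.
        for attr in predicates:
            setattr(self, attr, values.get(attr))

    return type(class_name, (object,), {"__init__": init, "PREDICATES": predicates})


Person = import_class("Person")
bob = Person(name="Bob")
```

Because the class is built on the fly, no generated source file exists that could drift out of sync with the ontology, at the cost of doing the schema lookup on every program start.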

MooseX::Semantic TODOs

Refactor that whole MooseX::Semantic::Util::TypeConstraintWalker insanity

  • Either do it like MooseX::Storage and not recurse at all
  • Or ask the Moose experts how to properly determine the classes / TypeConstraints of has-isa attributes
  • And make this extensible to people's needs (so it's possible to extend the mapping process for stuff like "Set::Object" or similar)

Use Data::Visitor

Other

  • Refactor the MooseX::Semantic::Types module to a module of its own
    • Combine with the heuristics from RDF::Closure
  • SparqlUpdate export filter from RDF::Trine: Is it useful, and is it correctly implemented? (As I recall, there was an API rewrite in one of kasei's branches.)
  • Improve import performance by deferring class instantiation to the latest possible time (using CodeRefs or making attributes 'lazy' or similar)
  • Discuss if backend integration / persistence should be part of MXS or not.
    • RdfBackend allows attaching an RDF::Trine::Store to a Moose class which is weird semantically, but convenient to use (just call $object->store).
    • RdfObsolescence is the idea that the MOP can be exploited to keep track of changed attributes and therefore changed statements, which can be useful for tracking change.
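The change-tracking idea behind RdfObsolescence can be sketched without reference to Moose. The following Python example is only an analogy: it hooks attribute assignment (standing in for the MOP) to record which attributes were modified after construction. The `Tracked` class and its `changed_attributes` method are hypothetical and not part of MooseX::Semantic.

```python
class Tracked:
    """Records which attributes were modified after construction."""

    def __init__(self, **values):
        # Bypass tracking while setting the initial state.
        object.__setattr__(self, "_dirty", set())
        for name, value in values.items():
            object.__setattr__(self, name, value)

    def __setattr__(self, name, value):
        # Only an actual change of value marks the attribute as dirty.
        if getattr(self, name, None) != value:
            self._dirty.add(name)
        object.__setattr__(self, name, value)

    def changed_attributes(self):
        return sorted(self._dirty)


doc = Tracked(title="Draft", author="alice")
doc.title = "Final"
doc.author = "alice"  # same value, so not recorded as a change
changed = doc.changed_attributes()
```

A mapper could then re-serialize only the statements that belong to the changed attributes instead of the whole object.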

Distinct Functionalities

Serialization of Moose classes to RDF

This is best done by something that has a deep understanding of the Moose semantics, like KiokuDB. Maybe the best approach here would be to write an RDF backend for KiokuDB that collapses a KiokuDB::Entry into an RDF::Trine::Model.

However, additional information is necessary to do this right:

  • Mapping rdf:type -> Moose TypeConstraint

De-Serialization of RDF to Moose objects

For this, the system should need only minimal introspection into the Moose mechanics: It should be enough to pass scalar/literal values to the corresponding attributes and recursively create nested objects.

Information needed:

  • Typemap rdf:type -> Perl Package Name
  • Mapping RDF property name to Moose has-attribute name
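The two pieces of information listed above are enough for a simple recursive inflater. The Python sketch below assumes triples are plain `(subject, predicate, object)` tuples of strings and that any value without a mapped `rdf:type` is a literal; `TYPEMAP`, `ATTRMAP` and `inflate` are hypothetical names, not an existing API.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF = "http://xmlns.com/foaf/0.1/"


class Person:
    def __init__(self, uri):
        self.uri = uri
        self.name = None
        self.knows = None


TYPEMAP = {FOAF + "Person": Person}                          # rdf:type -> class
ATTRMAP = {FOAF + "name": "name", FOAF + "knows": "knows"}   # property -> attribute


def inflate(subject, triples, seen=None):
    """Recursively build objects; `seen` keeps circular references from looping."""
    seen = {} if seen is None else seen
    if subject in seen:
        return seen[subject]
    props = [(p, o) for s, p, o in triples if s == subject]
    rdf_types = [o for p, o in props if p == RDF_TYPE]
    cls = next((TYPEMAP[t] for t in rdf_types if t in TYPEMAP), None)
    if cls is None:
        return subject  # a literal, or a resource with no mapped type
    obj = seen[subject] = cls(subject)
    for p, o in props:
        if p in ATTRMAP:
            setattr(obj, ATTRMAP[p], inflate(o, triples, seen))
    return obj


A, B = "http://example.org/alice", "http://example.org/bob"
TRIPLES = [
    (A, RDF_TYPE, FOAF + "Person"), (A, FOAF + "name", "Alice"), (A, FOAF + "knows", B),
    (B, RDF_TYPE, FOAF + "Person"), (B, FOAF + "name", "Bob"), (B, FOAF + "knows", A),
]
alice = inflate(A, TRIPLES)
```

Note that the inflater never inspects the class beyond calling its constructor and setting attributes, which matches the "minimal introspection" goal; multi-valued properties would need a list-aware variant.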

De-Serialization of RDF schema data to Moose classes

KiokuDB and MooseX::Semantic

What KiokuDB supports:

  • Serialization
    • Throw anything at it, optionally with an ID (otherwise a UUID is generated)
    • KiokuDB collapses this anything to a KiokuDB::Entry [1]
    • Serializes it using a KiokuDB::Serializer
    • Returns the ID
  • De-Serialization
    • Throw an ID at KiokuDB ($d->lookup($id))
    • KiokuDB retrieves the stored entry
    • Inflates the in-memory object from it using the KiokuDB::Linker [2]

Adding RDF support to KiokuDB could happen at [1] and [2] without much fiddling (and without much added value).

RDF-OO In other languages

Java

Python

Ruby

C++