HackathonNotes

From Semantic Web with Perl Wiki
  • Accepting Trine Resource Nodes as the datatype argument of the Trine literal constructor; already done it seems
  • Serialize to whatever based on arguments; test implementation
  • Remodel how we work with Statements; Toby's proof of concept
    • If you use e.g. a quad parser on triples, it should blow up.

Roles

  • Store
    • RDF::Trine::Store::API
    • RDF::Trine::Store::API::TripleStore
    • RDF::Trine::Store::API::QuadStore
    • RDF::Trine::Store::API::Readable
    • RDF::Trine::Store::API::Writeable
    • RDF::Trine::Store::API::BulkOps
    • RDF::Trine::Store::API::Pattern
    • RDF::Trine::Store::API::SPARQL
    • RDF::Trine::Store::API::ETags
    • RDF::Trine::Store::API::StableBlankNodes
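
As a rough illustration of how client code might use these roles, the sketch below defines two stand-in roles and a toy store that consumes them. The `Sketch::` package names and the required method names (`get_statements`, `add_statement`) are assumptions made for the example; they are not the actual role definitions.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the Readable and Writeable store roles.
package Sketch::Store::API::Readable {
    use Moose::Role;
    requires 'get_statements';    # assumed method name
}

package Sketch::Store::API::Writeable {
    use Moose::Role;
    requires 'add_statement';     # assumed method name
}

# A toy in-memory store that consumes both roles.
package Sketch::Store::Memory {
    use Moose;

    has _statements => (
        is      => 'ro',
        isa     => 'ArrayRef',
        default => sub { [] },
    );

    sub get_statements { @{ $_[0]->_statements } }
    sub add_statement  { push @{ $_[0]->_statements }, $_[1]; return }

    # Compose the roles once the required methods exist.
    with 'Sketch::Store::API::Readable', 'Sketch::Store::API::Writeable';

    __PACKAGE__->meta->make_immutable;
}

package main;

my $store = Sketch::Store::Memory->new;

# Client code checks capabilities by role instead of by base class.
if ( $store->does('Sketch::Store::API::Writeable') ) {
    $store->add_statement('a statement object would go here');
}
my $count = () = $store->get_statements;
print "store holds $count statement(s)\n";
```

Checking `does` rather than `isa` is what lets a read-only store skip the Writeable role entirely.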

Dependencies

Dependencies we've used so far. We need to keep track of these to keep them under control!

  • Moose
  • MooseX::Aliases
  • MooseX::Role::Parameterized
  • MooseX::Singleton
  • namespace::autoclean
  • MooseX::Types
  • MooseX::Types::Moose
  • MooseX::Types::URI
  • MooseX::Types::Path::Class

API Differences

Nodes

  • The RDF::Trine::Node abstract class is gone!
    • The compare and from_sse subs are available in RDF::Trine::Node::API
  • RDF::Trine::Node::Resource, RDF::Trine::Node::Blank, RDF::Trine::Node::Literal, etc. are now immutable. You can't use accessors such as literal_value or uri_value as setters.
  • If you want to construct a resource using a base URI, then use RDF::Trine::Node::Resource->new_with_base
  • If you want to construct a canonical literal, then use RDF::Trine::Node::Literal->new_canonical. The old way (of passing a true fourth parameter) does still work but is deprecated.
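
The immutability described above can be illustrated with a read-only Moose attribute. This is only a sketch: `Sketch::Node::Resource` is a hypothetical class, and its named-argument constructor is an assumption that does not reflect the real RDF::Trine constructor signature.

```perl
use strict;
use warnings;

package Sketch::Node::Resource {
    use Moose;

    # 'ro' generates a reader only, so uri_value cannot be used as a setter.
    has uri_value => ( is => 'ro', isa => 'Str', required => 1 );

    __PACKAGE__->meta->make_immutable;
}

package main;

my $node = Sketch::Node::Resource->new( uri_value => 'http://example.org/a' );
print $node->uri_value, "\n";

# Passing an argument to a read-only accessor throws an exception.
my $ok = eval { $node->uri_value('http://example.org/b'); 1 };
print "setter rejected\n" unless $ok;

# To "change" a node, construct a new one instead.
my $other = Sketch::Node::Resource->new( uri_value => 'http://example.org/b' );
print $other->uri_value, "\n";
```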

Statements

  • The RDF::Trine::Statement abstract class is gone!
    • The constructor should still work, but will issue a warning.
  • RDF::Trine::Statement::Triple and RDF::Trine::Statement::Quad are the main classes; neither is a subclass of the other.
  • context is now graph. context still exists as an alias.
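
A minimal sketch of the statement design described above: two unrelated classes, a `context` alias for `graph`, and a consumer that refuses triples when it expects quads. The `Sketch::` classes and the `consume_quad` helper are hypothetical, and the alias is written by hand here rather than with MooseX::Aliases.

```perl
use strict;
use warnings;

package Sketch::Statement::Triple {
    use Moose;
    has [qw(subject predicate object)] => ( is => 'ro', required => 1 );
    __PACKAGE__->meta->make_immutable;
}

# Deliberately not a subclass of Triple.
package Sketch::Statement::Quad {
    use Moose;
    has [qw(subject predicate object graph)] => ( is => 'ro', required => 1 );

    # Keep the old name working as a plain alias for the new one.
    sub context { $_[0]->graph }

    __PACKAGE__->meta->make_immutable;
}

package main;

# Hypothetical quad-only consumer: dies when handed anything but a quad.
sub consume_quad {
    my ($st) = @_;
    die 'expected a quad, got ' . ref($st) . "\n"
        unless $st->isa('Sketch::Statement::Quad');
    return join ' ', map { $st->$_ } qw(subject predicate object graph);
}

my %spo    = ( subject => '<s>', predicate => '<p>', object => '<o>' );
my $triple = Sketch::Statement::Triple->new(%spo);
my $quad   = Sketch::Statement::Quad->new( %spo, graph => '<g>' );

print consume_quad($quad), "\n";
print 'context alias: ', $quad->context, "\n";

my $ok = eval { consume_quad($triple); 1 };
print "triple rejected: $@" unless $ok;
```

Because neither class inherits from the other, the `isa` check cannot accidentally accept a triple where a quad is required.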

Stores

  • RDF::Trine::Store is no longer a base class. (It still has some delegating constructors, though.)
    • Most of the existing stores now do the various RDF::Trine::Store::API roles.

Formats Registry for Parsers and Serializers

[kasei] The format registry needs to allow parsers and serializers to register themselves with relevant metadata, and allow client code to access parsers and serializers (or the respective class names) based on some criteria. Metadata about parsers and serializers must include:

  • short name ("turtle")
  • canonical media type ("text/turtle")
  • other media types ("application/x-turtle")
  • file extensions (".ttl")
  • format URI ("http://www.w3.org/ns/formats/Turtle")
  • if they can model quads

The registry should provide methods to construct parsers and serializers given any one of these values (utilizing some mechanism to choose between two classes that handle the same format) and provide a method that implements content negotiation for constructing serializers. Existing code that attempts to do this includes the new, parser_by_media_type, serializer_names, and negotiate methods in RDF::Trine::Serializer (and similar methods in RDF::Trine::Parser).
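
The requirements above could be sketched roughly as follows, using core Perl only. Everything here is an assumption for illustration: the `Sketch::FormatRegistry` package, its `register`, `find`, and `negotiate` methods, and the metadata key names are hypothetical, and the negotiation is a much-simplified stand-in for the existing negotiate methods.

```perl
use strict;
use warnings;

package Sketch::FormatRegistry;

my @FORMATS;    # registered format records, in registration order

# Register one format; the keys mirror the metadata list above.
sub register {
    my ( $class, %meta ) = @_;
    push @FORMATS, {%meta};
    return;
}

# Find the first format matching a short name, media type,
# file extension, or format URI (case-insensitively).
sub find {
    my ( $class, $key ) = @_;
    for my $f (@FORMATS) {
        my @candidates = (
            $f->{name}, $f->{media_type}, $f->{format_uri},
            @{ $f->{other_media_types} || [] },
            @{ $f->{extensions}        || [] },
        );
        return $f
            if grep { defined($_) && lc($_) eq lc($key) } @candidates;
    }
    return;
}

# Tiny Accept-header negotiation: pick the registered format with the
# highest q-value; ties go to the earlier Accept entry.
sub negotiate {
    my ( $class, $accept ) = @_;
    my ( $best, $best_q ) = ( undef, 0 );
    for my $item ( split /\s*,\s*/, $accept ) {
        my ( $type, @params ) = split /\s*;\s*/, $item;
        my ($q) = map { /^q=([0-9.]+)$/ ? $1 : () } @params;
        $q = 1 unless defined $q;
        my $f = $class->find($type) or next;
        ( $best, $best_q ) = ( $f, $q ) if $q > $best_q;
    }
    return $best;
}

package main;

Sketch::FormatRegistry->register(
    name              => 'turtle',
    media_type        => 'text/turtle',
    other_media_types => ['application/x-turtle'],
    extensions        => ['.ttl'],
    format_uri        => 'http://www.w3.org/ns/formats/Turtle',
    quads             => 0,
);
Sketch::FormatRegistry->register(
    name       => 'nquads',
    media_type => 'application/n-quads',
    extensions => ['.nq'],
    quads      => 1,
);

print Sketch::FormatRegistry->find('.ttl')->{name}, "\n";
print Sketch::FormatRegistry->negotiate(
    'text/html, text/turtle;q=0.5, application/n-quads;q=0.8'
)->{name}, "\n";
```

A real registry would store parser and serializer class names in each record and would need a rule for choosing between two classes that handle the same format; this sketch simply returns the first match.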

[kba] Parser/serializer implementations could be in Perl, in an external library, or delegated elsewhere (e.g. letting the remote store handle data ingestion). Implementations should have metadata about:

  • speed
  • quality of processing
  • if they can handle quads ([kasei] see note below)
  • if they can handle input/output in a streaming fashion
  • [kasei] if they can populate or consume data from a NamespaceMap

[kasei] Instead of assuming that a parser/serializer defaults to triples but may also handle quads, should we consider generalizing these classes to be parsers/serializers of *any* format, with metadata indicating what is being parsed/serialized? Examples are triples, quads, and variable bindings (the various SPARQL result syntaxes).

[kasei] It would also be helpful to have roles indicating parsers/serializers that can handle single statements or nodes. For example, it is often useful to be able to serialize a single node in the turtle format for use in text output. There's an RDF::Trine::Serializer::Turtle->serialize_node method to do just that.

Registry mechanics

  • Use global package variables in the Registry (like it's done now)
  • Have the user provide a list of formats (configuration file with sensible hard-coded defaults)
  • Have the user specify preferences (speed over quality, triples-only etc.)
  • Do something like Dist::Zilla does with Plugins and Bundles
  • Namespace organization
   RDF::Trine::Format::RDFXML
   RDF::Trine::Format::RDFXML::Serializer::Default
   RDF::Trine::Format::RDFXML::Parser::XMLSAX
   RDF::Trine::Format::RDFXML::Parser::LibXML

[kasei] This naming scheme has problems for parsers/serializers that can handle multiple formats, like redland (and possibly serd in the future).


  • Toby's suggested namespace organization ([kasei] +1)
   RDF::Trine::FormatRegistry
   RDF::Trine::Format
   RDF::Trine::Parser::RDFXML
   RDF::Trine::Parser::RDFXML::Another
   RDF::Trine::Serializer::RDFXML

[kasei] I would prefer a solution that does not require the user to specify any configuration in the default case (make the supplying of format lists and user preferences strictly optional).
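
One way to keep configuration strictly optional is to give the selection routine built-in defaults and let callers override them. The sketch below assumes a hypothetical `choose_parser` function and invented speed/quality scores; the class names are taken from the namespace proposal above purely as labels.

```perl
use strict;
use warnings;

# Hypothetical implementation records; the scores are invented.
my @IMPLEMENTATIONS = (
    {   class   => 'RDF::Trine::Parser::RDFXML',
        format  => 'rdfxml',
        speed   => 2,
        quality => 3,
    },
    {   class   => 'RDF::Trine::Parser::RDFXML::Another',
        format  => 'rdfxml',
        speed   => 3,
        quality => 1,
    },
);

# Pick an implementation for a format. %prefs is optional: with no
# extra arguments the default ordering (quality, then speed) is used.
sub choose_parser {
    my ( $format, %prefs ) = @_;
    my @order = @{ $prefs{prefer} || [ 'quality', 'speed' ] };
    my @found = grep { $_->{format} eq $format } @IMPLEMENTATIONS;

    # Sort best-first, comparing on each preference key in turn.
    my ($best) = sort {
        my $cmp = 0;
        for my $key (@order) {
            $cmp = $b->{$key} <=> $a->{$key} and last;
        }
        $cmp;
    } @found;

    return $best ? $best->{class} : undef;
}

# Default case: no configuration supplied.
print choose_parser('rdfxml'), "\n";

# User preference: speed over quality.
print choose_parser( 'rdfxml', prefer => ['speed'] ), "\n";
```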

TODO

  • Fix RDF::Trine::Store::DBI so that instead of re-blessing the object when using a recognized database (e.g. blessing into RDF::Trine::Store::DBI::mysql), the code applies a database-specific role to the store object (e.g. make RDF::Trine::Store::DBI::mysql a role and apply it to the store)
  • Convert use of Error to TryCatch and Throwable::Error
  • Update RDF::Query::Algebra::Triple to inherit from RDF::Trine::Statement::Triple instead of RDF::Trine::Statement
  • Make sure documentation for changed classes (including all Store classes) still makes sense for Moose (e.g. update discussion of inheritance to talk about roles)
  • Improve error handling in Ruben's new parsers (NTriples and Turtle) to include contextual information (the content that caused the error, line numbers, etc.)
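
The first TODO item (applying a database-specific role instead of re-blessing) might look roughly like the sketch below. The `Sketch::` packages, the `new_for_driver` factory, and the `quote_identifier` method are hypothetical; only `Moose::Util::apply_all_roles` is an existing Moose function.

```perl
use strict;
use warnings;

# Database-specific behaviour lives in a role, not a subclass.
package Sketch::Store::DBI::mysql {
    use Moose::Role;

    sub quote_identifier {
        my ( $self, $name ) = @_;
        return "`$name`";
    }
}

package Sketch::Store::DBI {
    use Moose;
    use Moose::Util ();

    has driver => ( is => 'ro', isa => 'Str', required => 1 );

    # Generic default, used when no driver-specific role applies.
    sub quote_identifier {
        my ( $self, $name ) = @_;
        return qq("$name");
    }

    my %ROLE_FOR = ( mysql => 'Sketch::Store::DBI::mysql' );

    # Build the store, then apply a role to this one object if the
    # driver is recognized.
    sub new_for_driver {
        my ( $class, $driver ) = @_;
        my $self = $class->new( driver => $driver );
        my $role = $ROLE_FOR{$driver};
        Moose::Util::apply_all_roles( $self, $role ) if $role;
        return $self;
    }
}

package main;

my $mysql  = Sketch::Store::DBI->new_for_driver('mysql');
my $sqlite = Sketch::Store::DBI->new_for_driver('sqlite');

print $mysql->quote_identifier('statements'),  "\n";
print $sqlite->quote_identifier('statements'), "\n";
print 'mysql store does the role: ',
    ( $mysql->does('Sketch::Store::DBI::mysql') ? 'yes' : 'no' ), "\n";
```

Applying a role to an instance makes Moose create an anonymous subclass for that object, so other stores of the same class are unaffected and `isa('Sketch::Store::DBI')` still holds.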