Author: | Lars Heuer |
---|---|
Organization: | Semagia |
Date: | 2010-09-30 |
Status: | Draft |
The Compact RDF to Topic Maps Mapping Syntax (CRTM) is a language to describe an RDF [RDF] to Topic Maps [TMDM] mapping. It was designed to be compact and easy to write and read.
The RTM ([RTM]) specification defines an RDF vocabulary which describes how RDF triples are converted into Topic Maps constructs. While that approach certainly works and has its advantages (like embedding the mapping into the RDF source), people familiar with Topic Maps have to write the mapping in an RDF format and the mapping may become verbose. More importantly, it is not possible to reuse an existing mapping within another mapping.
CRTM provides the same expressiveness as [RTM] and adds some features, such as translating the optional language tag into scope and reusing existing mappings.
This language operates with the same algorithm as described by [RTM]; it's just an alternative, compact syntax (aside from the enhancements which are not supported by [RTM]).
Like [RTM], CRTM uses RDF predicates to translate triples to Topic Maps constructs. The general notation is:
predicate-list ':' topic-maps-equivalent
where predicate-list consists of one or more RDF predicates that should be mapped to Topic Maps.
The following sections assume that the prefix foaf is bound to the IRI http://xmlns.com/foaf/0.1/ and the prefix tmdm to the IRI http://psi.topicmaps.org/iso13250/model/.
Whitespace consists of one or more space (#x20) characters, carriage returns, line feeds, or tabs. Whitespace characters are allowed everywhere to separate tokens (terminals and non-terminals).
Comments are introduced with the hash sign (#) and continue until the end of the current line. They are allowed wherever whitespace is allowed:
# This is a comment
foaf:name: tmdm:topic-name # This is also a comment
IRIs are enclosed in < and >:
<http://www.semagia.com/> # This is an IRI
<http://ws.mappify.org/rdf2tm/> # This, too ;)
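Because the hash sign may also occur inside an IRI (for instance in a namespace IRI that ends with #), a reader has to distinguish the two cases. The following Python sketch shows one way to strip comments from a single line. It assumes that a # between < and > belongs to the IRI; the function name strip_comment is hypothetical and not part of CRTM.

```python
def strip_comment(line: str) -> str:
    """Remove a trailing CRTM comment from one line (illustrative sketch).

    Assumption: a '#' between '<' and '>' is part of an IRI and does not
    start a comment.
    """
    in_iri = False
    for index, char in enumerate(line):
        if char == "<":
            in_iri = True
        elif char == ">":
            in_iri = False
        elif char == "#" and not in_iri:
            # Everything from the hash sign to the end of the line is a comment.
            return line[:index].rstrip()
    return line.rstrip()
```

A line that consists only of a comment is reduced to the empty string, while the # inside an IRI is kept.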
QNames abbreviate IRIs and can be used (nearly) everywhere an IRI is allowed. During deserialization, the IRI to which the prefix is bound is concatenated with the local part. The result is always an absolute IRI:
foaf:name: tmdm:topic-name

# Equivalent statement:
<http://xmlns.com/foaf/0.1/name>: <http://psi.topicmaps.org/iso13250/model/topic-name>
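The expansion described above can be sketched in Python as follows. The prefix table and the function name expand_qname are assumptions made for illustration; a real reader would fill the table from the prefix directives of the mapping.

```python
# Hypothetical prefix table; a real reader would build it from %prefix directives.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "tmdm": "http://psi.topicmaps.org/iso13250/model/",
}


def expand_qname(qname: str, prefixes: dict) -> str:
    """Concatenate the IRI bound to the prefix with the local part."""
    prefix, separator, local = qname.partition(":")
    if not separator or prefix not in prefixes:
        raise ValueError(f"unbound or malformed QName: {qname!r}")
    return prefixes[prefix] + local


print(expand_qname("foaf:name", PREFIXES))
# prints http://xmlns.com/foaf/0.1/name
```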
The scope of a Topic Maps statement is introduced by the @ sign followed by a list of QNames or IRIs:
# Maps the foaf:nick to the default name type and uses foaf:nick
# as scope
foaf:nick: tmdm:topic-name @foaf:nick

# Same as above, but assigns two themes
foaf:nick: tmdm:topic-name @foaf:nick, <http://psi.example.org/nickname>
While RDF knows just one identity type, Topic Maps provides two 'strong' identities (subject identifiers and subject locators) and one 'weak' identity type, the item identifiers.
CRTM provides support for all three identity types: the object may be mapped either to a subject identifier, a subject locator, or an item identifier.
A mapping to a subject identifier is expressed through either the keyword subject-identifier or its shortcut sid:
foaf:mbox: subject-identifier

# Shortcut
foaf:mbox: sid
Subject locators can be established through the keyword subject-locator or slo:
foaf:img: subject-locator

# Shortcut
foaf:img: slo
Item identifiers are created through the keyword item-identifier or iid:
foaf:mbox: item-identifier

# Shortcut
foaf:mbox: iid
Occurrences are created with the keyword occurrence or the shortcut occ:
# Maps the foaf:homepage to an occurrence
foaf:homepage: occurrence

# Shortcut
foaf:homepage: occ
Optionally, a type and a scope may be specified:
# Overrides the foaf:homepage type with the specified type
foaf:homepage: occurrence <http://psi.example.org/homepage>

# Adding type and scope:
foaf:homepage: occurrence ex:homepage @lang:en
If a type and / or scope is specified, the keywords occurrence and occ are optional:
foaf:homepage: <http://psi.example.org/homepage>
foaf:homepage: ex:homepage @lang:en
foaf:homepage: @lang:en
The notation for names is similar to the occurrence notation, except that the keyword is name. Alternatively, the hyphen - can be used to indicate that the RDF statement should be mapped to a topic name:
# Maps foaf:name to a topic name with the type foaf:name.
foaf:name: name

# Maps foaf:name to the default topic name type.
foaf:name: - tmdm:topic-name

# Maps foaf:nick to the default topic name type and adds foaf:nick to the scope
foaf:nick: - tmdm:topic-name @foaf:nick

# Equivalent statement; using the name keyword:
foaf:nick: name tmdm:topic-name @foaf:nick
Associations can be created with the (optional) keywords association or assoc followed by an optional type (if the RDF predicate should not be used as association type) followed by the role types and an optional scope:
# Creates an association where the subject and object play a role of type ``foaf:Person``
foaf:knows: association (foaf:Person, foaf:Person)

# Creates an association where the subject plays the ``ex:member`` role
# and the object the ``ex:group`` role.
ex:member-of: assoc(ex:member, ex:group)
The association keyword is optional:
# Same example as above
ex:member-of: (ex:member, ex:group)
Further, the association type can be overridden:
# The resulting association will have the type ex:parent-of rather than
# ex:child-of
ex:child-of: ex:parent-of(ex:child, ex:parent)
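To illustrate how an association mapping is applied to a single RDF triple, the following Python sketch builds a simple dictionary that stands in for an association. The class AssociationMapping, the function to_association, and the dictionary layout are assumptions made for this example; they do not describe the data structures of an actual CRTM implementation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class AssociationMapping:
    """Hypothetical in-memory form of 'predicate: type?(subject-role, object-role)'."""
    subject_role: str
    object_role: str
    type: Optional[str] = None  # None: the RDF predicate is used as association type


def to_association(triple: Tuple[str, str, str], mapping: AssociationMapping) -> dict:
    """Turn one (subject, predicate, object) triple into an association."""
    subject, predicate, obj = triple
    return {
        "type": mapping.type or predicate,
        "roles": [
            (mapping.subject_role, subject),  # the subject plays the first role
            (mapping.object_role, obj),       # the object plays the second role
        ],
    }


# ex:child-of: ex:parent-of(ex:child, ex:parent)
mapping = AssociationMapping("ex:child", "ex:parent", type="ex:parent-of")
association = to_association(("ex:alice", "ex:child-of", "ex:bob"), mapping)
```

In this sketch the resulting association has the type ex:parent-of, ex:alice plays the ex:child role, and ex:bob plays the ex:parent role. The topics ex:alice and ex:bob are invented for the example.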
While it is possible to model type-instance relationships with the notation for associations, CRTM provides a shortcut for it:
# Maps rdf:type to a type-instance association where the
# subject plays the tmdm:instance role and the object the tmdm:type role.
rdf:type: isa
Like associations, the type-instance shortcut accepts an optional scope.
Occurrences and names can utilize the optional RDF language tag which is translated into a topic which uses one of the OASIS ISO 639 PSIs [OASIS]. By default, the RDF language tags are ignored (for compatibility with [RTM]).
If the RDF source provides language tags, they can be added to the scope of the occurrence:
foaf:homepage: occurrence; lang=true
The lang=true instruction instructs the CRTM reader to convert the (optional) language tag into a topic which is added to the occurrence scope.
Given the following [Turtle] statements:
ex:tinyTiM dc:description "tinyTiM is a small Topic Maps engine"@en .
ex:tinyTiM dc:description "tinyTiM ist eine kleine Topic Maps Engine"@de .
the instruction:
dc:description: occurrence; lang=true
results in the following topic (using [CTM]):
%prefix lang <http://psi.oasis-open.org/iso/639/#>
# Other prefixes omitted

ex:tinyTiM
    dc:description: "tinyTiM is a small Topic Maps engine"@lang:eng;
    dc:description: "tinyTiM ist eine kleine Topic Maps Engine"@lang:deu.
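The translation of a language tag into a scoping topic can be sketched in Python as follows. The lookup table contains only the two codes used in the example above, and the handling of tags with subtags (such as en-US) is an assumption of this sketch; the function name language_theme is hypothetical.

```python
from typing import Optional

LANG_PSI_BASE = "http://psi.oasis-open.org/iso/639/#"

# Only the two codes from the example; a complete table is needed in practice.
TWO_TO_THREE_LETTER = {"en": "eng", "de": "deu"}


def language_theme(tag: Optional[str]) -> Optional[str]:
    """Return the subject identifier of the language topic, or None.

    Assumption: only the primary subtag is considered, so 'en-US' is
    treated like 'en'. Unknown or missing tags yield no theme.
    """
    if not tag:
        return None
    primary = tag.split("-", 1)[0].lower()
    code = TWO_TO_THREE_LETTER.get(primary)
    return LANG_PSI_BASE + code if code else None


print(language_theme("en"))
# prints http://psi.oasis-open.org/iso/639/#eng
```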
CRTM also offers a global setting to translate language tags; it is therefore also possible to disable the translation on a per-statement basis:
dc:description: occurrence; lang=false
This would disable the translation of the language tag for the dc:description RDF predicate.
The prefix directive binds an identifier to an IRI so that QNames can be used instead of more verbose IRIs.
Example:
%prefix ex <http://www.example.org/>

ex:foo: subject-identifier
In the example above the QName ex:foo is expanded to http://www.example.org/foo.
The language-to-scope directive enables or disables the translation of the optional RDF language tag for all occurrences and names.
Example:
%langtoscope true

dc:description: occurrence
Given the following [Turtle] statements:
ex:tinyTiM dc:description "tinyTiM is a small Topic Maps engine"@en .
ex:tinyTiM dc:description "tinyTiM ist eine kleine Topic Maps Engine"@de .
the resulting [CTM] topic would be:
%prefix lang <http://psi.oasis-open.org/iso/639/#>
# Other prefixes omitted

ex:tinyTiM
    dc:description: "tinyTiM is a small Topic Maps engine"@lang:eng;
    dc:description: "tinyTiM ist eine kleine Topic Maps Engine"@lang:deu.
Each mapping may override the global setting by setting lang to false:
%langtoscope true

dc:description: occurrence; lang=false
This results in:
ex:tinyTiM
    dc:description: "tinyTiM is a small Topic Maps engine";
    dc:description: "tinyTiM ist eine kleine Topic Maps Engine".
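The interplay between the global directive and the per-statement lang instruction can be summarized in a small Python sketch. The function name and its parameters are hypothetical; the default of False for the global setting follows the rule that language tags are ignored unless stated otherwise.

```python
from typing import Optional


def translate_language_tag(global_setting: bool = False,
                           statement_setting: Optional[bool] = None) -> bool:
    """Decide whether the language tag is added to the scope.

    global_setting:    value of %langtoscope (False if the directive is absent)
    statement_setting: value of 'lang=...' on the statement, None if omitted
    """
    if statement_setting is not None:
        # The per-statement instruction always wins over the directive.
        return statement_setting
    return global_setting
```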
To create modular mappings or to reuse other mappings, CRTM offers the include directive:
%include <foaf.crtm>
The referenced mapping is added to the current CRTM instance. The IRI of the referenced mapping is resolved against the document locator of the current CRTM instance.
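The resolution of the include IRI against the document locator can be sketched with Python's standard library. The function name resolve_include and the document locator are assumptions for this example; note that urljoin operates on URI references, so IRIs with non-ASCII characters may need additional handling.

```python
from urllib.parse import urljoin


def resolve_include(document_locator: str, include_iri: str) -> str:
    """Resolve the IRI of an %include directive against the including document."""
    return urljoin(document_locator, include_iri)


# Hypothetical document locator of the mapping that contains '%include <foaf.crtm>'.
print(resolve_include("http://example.org/mappings/doap.crtm", "foaf.crtm"))
# prints http://example.org/mappings/foaf.crtm
```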
To shorten mappings further, CRTM allows a list of predicates to be mapped to the same Topic Maps construct.
The following statements:
foaf:familyName: name
foaf:firstName: name
foaf:givenName: name
can be folded into one statement:
foaf:familyName, foaf:firstName, foaf:givenName: name
That notation works for all CRTM instructions:
foaf:homepage, foaf:workInfoHomepage: subject-identifier
If a mapping is dedicated to a particular domain (e.g. the FOAF vocabulary), always typing QNames can be cumbersome; CRTM therefore offers an alternative syntax:
%prefix foaf <http://xmlns.com/foaf/0.1/>

foaf {
    name: name
    nick: name
}
The identifiers within the curly braces are interpreted as the local part of a QName with the prefix "foaf". The grouped statements are not limited to one prefix, though:
%prefix doap <http://usefulinc.com/ns/doap#>
%prefix foaf <http://xmlns.com/foaf/0.1/>

doap {
    shortdesc, description: occurrence
}

foaf {
    name: name
    nick: name
}
As shown in the example above, it's also possible to use a list of identifiers which should be mapped to a Topic Maps statement (doap:shortdesc and doap:description).
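Grouped statements and identifier lists are shorthand for a series of single statements. The following Python sketch makes the expansion explicit. The function name expand_group and the representation of a statement as a (predicate, body) pair are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def expand_group(prefix: str,
                 entries: Sequence[Tuple[Sequence[str], str]]) -> List[Tuple[str, str]]:
    """Expand 'prefix { ids: body ... }' into single (QName, body) statements.

    entries: pairs of (local identifiers, statement body), in document order.
    """
    return [
        (f"{prefix}:{local}", body)
        for local_ids, body in entries
        for local in local_ids
    ]


statements = expand_group("doap", [(["shortdesc", "description"], "occurrence")])
statements += expand_group("foaf", [(["name"], "name"), (["nick"], "name")])
```

After the expansion, statements holds the four single statements doap:shortdesc: occurrence, doap:description: occurrence, foaf:name: name, and foaf:nick: name.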
This section shows a complete example mapping:
#
# CRTM example that maps a subset of the DOAP voc. to Topic Maps
#
%prefix doap <http://usefulinc.com/ns/doap#>
%prefix tmdm <http://psi.topicmaps.org/iso13250/model/>
%prefix rdf <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
%prefix foaf <http://xmlns.com/foaf/0.1/>
%prefix ex <http://psi.example.org/doap/>

rdf:type: isa

foaf {
    # Map the foaf:name type to the default name type.
    name: - tmdm:topic-name
    homepage: occurrence
}

doap {
    # Map the doap:name type to the default name type.
    name: - tmdm:topic-name
    shortname: name

    # Map all of the following DOAP properties to occurrences
    shortdesc, description, homepage, download-page, bug-database,
    mailing-list, license, programming-language, browse: occurrence

    # Create an association from the repository property
    repository: ex:has-repository(ex:project, ex:repository)

    # Create an assoc from the maintainer property
    maintainer: ex:maintains(ex:project, ex:maintainer)

    # Treat the repository URL as subject locator.
    location: subject-locator
}
The Compact RTM grammar is defined as follows (using the EBNF notation defined in XML 1.0 (Third Edition) [EBNF]).
Note
CRTM has no reserved words; all keywords may be used as identifiers.
instance          ::= directive* (statement | prefix)*
directive         ::= include | prefix | lang-to-scope
prefix            ::= '%prefix' IDENT IRI
include           ::= '%include' IRI
lang-to-scope     ::= '%langtoscope' boolean
statement         ::= grouped-statement | single-statement
grouped-statement ::= (IDENT | IRI) '{' (local_ids ':' statement-body)+ '}'
local_ids         ::= LOCAL_IDENT (',' LOCAL_IDENT)*
single-statement  ::= predicates ':' statement-body
statement-body    ::= identity | type-of | subtype-of | association | occurrence | name
predicates        ::= qiri (',' qiri)*
identity          ::= sid | slo | iid
sid               ::= 'subject-identifier' | 'sid'
slo               ::= 'subject-locator' | 'slo'
iid               ::= 'item-identifier' | 'iid'
type-of           ::= 'isa' scope?
subtype-of        ::= 'ako' scope?
association       ::= ('association' | 'assoc')? type? roles scope?
roles             ::= '(' subject-role ',' object-role ')'
subject-role      ::= qiri
object-role       ::= qiri
occurrence        ::= ('occurrence' | 'occ') type? scope? language?
                    | type scope? language?
                    | scope language?
name              ::= ('name' | '-') type? scope? language?
language          ::= ';' 'lang' '=' boolean
type              ::= qiri
scope             ::= '@' theme (',' theme)*
theme             ::= qiri
qiri              ::= QNAME | IRI
boolean           ::= 'true' | 'false'
IDENT             ::= ID_START (.* ID_CHAR)*
LOCAL_IDENT       ::= IDENT | ([0-9]+ (\.* ID_CHAR)*)
QNAME             ::= IDENT ':' LOCAL_IDENT
IRI               ::= '<' [^<>"{}`\ ]+ '>'
COMMENT           ::= '#' [^#xA#xD]*
ID_START          ::= [a-zA-Z_] | [\u00C0-\u00D6] | [\u00D8-\u00F6]
                    | [\u00F8-\u02FF] | [\u0370-\u037D] | [\u037F-\u1FFF]
                    | [\u200C-\u200D] | [\u2070-\u218F] | [\u2C00-\u2FEF]
                    | [\u3001-\uD7FF] | [\uF900-\uFDCF] | [\uFDF0-\uFFFD]
                    | [\u10000-\uEFFFF]
ID_CHAR           ::= ID_START | [-.0-9] | \u00B7 | [\u0300-\u036F] | [\u203F-\u2040]
[RDF] | Resource Description Framework (RDF): Concepts and Abstract Syntax, W3C, W3C Recommendation, 10 February 2004, http://www.w3.org/TR/rdf-concepts/ |
[TMDM] | ISO/IEC 13250-2: Topic Maps — Data Model (TMDM), 2006, http://www.isotopicmaps.org/sam/sam-model/2008-06-03/ |
[RTM] | The RTM RDF to Topic Maps mapping, Ontopia A/S, 2003, http://www.ontopia.net/topicmaps/materials/rdf2tm.html |
[Turtle] | Turtle - Terse RDF Triple Language, David Beckett, 2008, http://en.wikipedia.org/wiki/Turtle_%28syntax%29 |
[CTM] | ISO/IEC 13250-6: Topic Maps — Compact Syntax (CTM), http://www.isotopicmaps.org/ctm/ |
[OASIS] | OASIS PubSubj TC, Published subjects for languages in ISO 639, http://psi.oasis-open.org/iso/639/ |
[FOAF] | FOAF Vocabulary Specification, 2010, 3rd edition, http://xmlns.com/foaf/spec/20100101.html |
[IRI] | IETF RFC 3987, Internationalized Resource Identifiers (IRIs), Internet Standards Track Specification, January 2005, http://www.ietf.org/rfc/rfc3987.txt |
[EBNF] | XML 1.0, Extensible Markup Language (XML) 1.0, W3C, Third Edition, W3C Recommendation, 04 February 2004, http://www.w3.org/TR/REC-xml/ |