Tumbled Logic

17.11.2013 19:23:00

Modelling activity streams with quads

I’ve been thinking for a while about how one might use a quad-store as a database of activity streams, and consequently how to model them.

We have four key pieces of information, plus a set of ancillary properties. These are:

A unique, generated identifier for the action, so that we can refer to it later
The person (or agent) performing the action
The action being performed (verb)
The thing the action is being performed upon

On top of these, we might have various other pieces of information: the action’s timestamp, the system it was performed upon, a policy associated with it (such as whether it can be shared anonymously), and so on.

The thought occurs that the verb could be easily represented as a predicate (previous approaches that I’ve seen represent the verb as an instance of a class instead, but my gut instinct is that this makes dealing with the data harder than it needs to be).

Now, it’s entirely up to you which way around you put the agent and the thing being acted upon, but I’ve opted to use the agent as the subject. That choice does affect the way I name my terms: I’ve opted for watched instead of watchedBy, for example.

Finally, the identifier for the action is used as the graph name, but is also annotated with additional properties in a separate graph controlled by the activity store itself. This is, I’ll admit, not the usual way to approach named graphs, but it does mean that when you reduce down to triples, you’re left with a stream of the three most important pieces of information: agent, verb, thing.
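
To make that reduction concrete, here is a minimal Python sketch that treats each quad as a plain tuple. The Quad type, the to_triples helper and the STORE_GRAPH constant are names I've made up for illustration (they aren't part of any real quad-store API), and it assumes that "reducing to triples" means setting aside the store's own metadata graph first.

```python
from typing import NamedTuple


class Quad(NamedTuple):
    subject: str
    predicate: str
    obj: str
    graph: str


# Hypothetical name for the graph controlled by the activity store itself.
STORE_GRAPH = "/#id"
ACTION_ID = "/2013/11/17/19/13/38078878b361481a9a05210d6611b89a#id"

quads = [
    # The activity itself: the graph name is the action's identifier.
    Quad("http://neva.li/#me", "act:watched",
         "http://www.bbc.co.uk/programmes/b03hy7hm#programme", ACTION_ID),
    # An annotation about the action, held in the store's own graph.
    Quad(ACTION_ID, "dct:created", "2013-11-17T19:13:47Z", STORE_GRAPH),
]


def to_triples(quads, exclude_graph=STORE_GRAPH):
    """Drop the graph name from every quad outside the store's metadata graph."""
    return [(q.subject, q.predicate, q.obj)
            for q in quads if q.graph != exclude_graph]


# What remains is the bare stream of (agent, verb, thing).
activity_stream = to_triples(quads)
```

Going the other way, the graph name of any quad is the key for looking up its annotations in the store's graph.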

Enough waffling. Here’s an example:


@prefix act: <http://example.com/activity/> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix event: <http://purl.org/NET/c4dm/event.owl#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix tl: <http://purl.org/NET/c4dm/timeline.owl#> .

</2013/11/17/19/13/38078878b361481a9a05210d6611b89a>
    foaf:primaryTopic </2013/11/17/19/13/38078878b361481a9a05210d6611b89a#id> .

</2013/11/17/19/13/38078878b361481a9a05210d6611b89a#id> {
    <http://neva.li/#me> act:watched <http://www.bbc.co.uk/programmes/b03hy7hm#programme> .
}

</#id> {
  </2013/11/17/19/13/38078878b361481a9a05210d6611b89a#id>
    a event:Event ;
    event:time [
        a tl:Interval ;
        tl:start "2013-11-17T18:25:33Z"^^xsd:dateTime ;
        tl:duration "PT48M14S"^^xsd:duration
    ] ;
    dct:created "2013-11-17T19:13:47Z"^^xsd:dateTime ;
    dct:source <http://www.bbc.co.uk/iplayer/#id> .
}

You’ll note that my activity identifiers contain a date and time (down to the minute level, at least) and a UUID. The date path allows for logical navigation patterns to retrieve aggregations (e.g., “return all activity from November 2013”), while the UUID avoids reliance upon a single naming authority. There are, of course, a whole host of different ways of doing this.
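
As a rough sketch of that naming scheme, here is one way to build and filter such identifiers in Python. The function names make_activity_id and in_month are hypothetical, and I'm assuming the path is derived from a UTC timestamp followed by the UUID's 32 hex digits.

```python
import uuid
from datetime import datetime, timezone


def make_activity_id(when, unique=None):
    """Build a path like /2013/11/17/19/13/<32 hex chars> from a timestamp."""
    unique = unique or uuid.uuid4()
    path = when.astimezone(timezone.utc).strftime("/%Y/%m/%d/%H/%M/")
    return path + unique.hex


def in_month(identifiers, year, month):
    """Select identifiers by date prefix, e.g. all activity from November 2013."""
    prefix = f"/{year:04d}/{month:02d}/"
    return [i for i in identifiers if i.startswith(prefix)]


# Reproduces the identifier used in the example above.
example = make_activity_id(
    datetime(2013, 11, 17, 19, 13, 47, tzinfo=timezone.utc),
    uuid.UUID("38078878b361481a9a05210d6611b89a"),
)
```

Because the UUID is generated locally, any number of stores can mint identifiers without coordinating, and the date prefix still gives you cheap aggregation by year, month, day, hour or minute.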

Thoughts?

Update: Ryan Adams points me at Tin Can, which I must have seen when it was announced, but haven’t actually looked at the spec for until after posting this. Interestingly, it looks like it would be fairly straightforward to express Tin Can statements in the form above (and even easier if the experience verbs had an RDF vocab representation).