Managing digital rights in the publishing world

When people talk about Digital Rights Management, or DRM, the real subject of their discussion is usually Digital Rights Enforcement. The basic use case seems to be how to prevent a teenage boy from duping "Rush Hour II" for his friends, or some variation thereof—what you add to the DVD storing the movie, what you add to the players to check on the DRE components of the DVD, and so forth.

The management of digital rights, as opposed to their enforcement, is a real problem in the publishing industry, but discussion is usually drowned out by the shouting matches about digital rights enforcement. Here's a typical use case: an editor wants to take an article with six pictures from her magazine's print edition and put it online. Two of the pictures come from a cookbook being reviewed by the article, two were shot for the article by a freelancer, and two come from a stock photo house. Which images can the editor use in the online version?

Unlike the mythical teenage boy, a publisher has an ongoing relationship to maintain with the content suppliers and doesn't want to jeopardize any of these relationships by avoiding extra payments. The difficulty is simply digging up the terms of re-use so that the editor knows which images are available to put with the article on the website. When looking this up takes too much time, it affects the publication schedule.

Whether you build or buy a system to track this, there are three basic approaches, but first, a note on software: there are vendors who will tell you "our fabulous product takes care of all that! Simply check in the pictures or other content and enter the re-use terms, and then you can look it up any time!" I'm not interested in this unless the software can read and write the re-use terms in a standard format whose specs are independent of the software. (As we'll see below, this is easier said than done.)

The devil is in the details of how you enter the re-use terms. Of the three approaches, the PRISM standard for magazine metadata offers variations on each, so I'll refer to that when I need examples, although only the second option below demonstrates a technique that is specific to PRISM.

Option 1: a slot for a natural language description

In this scenario, you write out one or more sentences describing what you can do with the work. This could be stored in a relational database, a rights tracking package that you bought from a vendor, or in some XML. You may have the option of storing this inside the work itself, especially if it's in XML. The PRISM standard uses the Dublin Core rights field as the name of this element, which is a fine idea.

The problem here is that you leave it up to the person writing out the sentences to either accurately copy the agreement terms or to paraphrase them properly, and that leaves room for error.

Option 2: a set of fields to store specific re-use parameters

The good news: it's easier to automate the processing of this information when, for example, a system puts an image on a website. The bad news: what fields do you use to store the information you want to track? There are a lot of parameters in these agreements. What standards are out there? How well do they fit with your needs? When a stock photo house supplies images to a magazine, there will be one set of information to track; when that magazine supplies an illustrated article to an aggregator, that relationship will be governed by some of the same pieces of information, but also by some different ones.

PRISM has defined a few fields such as embargoDate and expirationDate for this. These can be stored inside of a dc:rights element, or anywhere else for that matter. If they are stored in a relational database, it would be useful to record somewhere that the "embargoDate" field means what the PRISM standard says it means and not someone else's slightly different concept of the same term. (I know some people hate namespaces, but they're awfully useful sometimes...) The PRISM standard will not have everything you need, and I know that arriving at the few re-use rights fields that it does have was the result of a great deal of work.
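To show why specific fields are easier to automate than free text, here is a minimal Python sketch of a check that a publishing system might run before putting an image on a website. The ReuseTerms class, the usable_on function, and the sample dates are all hypothetical; only the two field names are borrowed from PRISM, and treating embargoDate as "not before" and expirationDate as "not after" is my assumption about how you would apply them.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class ReuseTerms:
    """Hypothetical record of re-use terms for one asset.

    The date fields are named after PRISM's embargoDate and
    expirationDate; the interpretation below is an assumption.
    """
    asset_id: str
    embargo_date: Optional[date] = None     # assumed: not usable before this date
    expiration_date: Optional[date] = None  # assumed: not usable after this date


def usable_on(terms: ReuseTerms, day: date) -> bool:
    """Return True if `day` falls inside the window the terms allow."""
    if terms.embargo_date is not None and day < terms.embargo_date:
        return False
    if terms.expiration_date is not None and day > terms.expiration_date:
        return False
    return True


# Illustrative dates only; they do not come from any real contract.
corfu = ReuseTerms(
    asset_id="http://wanderlust.com/2000/08/Corfu.jpg",
    embargo_date=date(2000, 8, 1),
    expiration_date=date(2001, 7, 31),
)

print(usable_on(corfu, date(2000, 9, 15)))
print(usable_on(corfu, date(2002, 1, 1)))
```

A real agreement would carry many more parameters (media, territory, fees, and so on), which is exactly the "what fields do you use?" problem described above.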

Option 3: point to an image of the official document

This is my favorite in terms of bang for buck, because the lack of data entry involved means less work and less room for error, and there's no issue about which collection of information fields to track. If you already have your images in a Digital Asset Management system, then tracking images of the agreements that govern those images won't put much extra strain on your system. Scan the agreement and give it an identifier. The PRISM Introduction document includes the following example to identify the file with the contract governing the use of the image Corfu.jpg:

  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg"> 
    <dc:rights rdf:resource="http://PhillyPhantasyPhotos.com/terms/Contract39283.doc"/> 
  </rdf:Description>

I consider a Word doc file a bit too mutable for an official record of a legal document, but the ID could just as easily point to a TIFF or JPG file of a scanned contract:

  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg"> 
    <dc:rights rdf:resource="http://somepath/in/our/intranet/Contract39283.jpg"/> 
  </rdf:Description>

The information need not be stored in RDF/XML either—you could put it in a relational database, a rights tracking package, XMP embedded in the image, or whatever you like—but you have to admit, RDF/XML isn't always an ugly mess, and what we see above is pretty straightforward.
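As a rough illustration of how little code it takes to look up the contract for an image, here is a Python sketch that reads the snippet above with the standard library's XML parser. The rdf:RDF wrapper with its namespace declarations and the contracts_by_asset function name are my additions for the sake of a self-contained example, and this is a plain XML parse, not a full RDF parser, so it only handles data shaped like the sample.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"

# The example from the text, wrapped in an rdf:RDF root (an assumption)
# so that the namespace prefixes are declared.
SAMPLE = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:dc="{DC}">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:rights rdf:resource="http://somepath/in/our/intranet/Contract39283.jpg"/>
  </rdf:Description>
</rdf:RDF>
"""


def contracts_by_asset(rdf_xml: str) -> Dict[str, List[str]]:
    """Map each described asset URL to the contract files its dc:rights point to."""
    root = ET.fromstring(rdf_xml.strip())
    result: Dict[str, List[str]] = {}
    for desc in root.iter(f"{{{RDF}}}Description"):
        asset = desc.get(f"{{{RDF}}}about")
        if asset is None:
            continue
        for rights in desc.findall(f"{{{DC}}}rights"):
            target = rights.get(f"{{{RDF}}}resource")
            if target is not None:
                result.setdefault(asset, []).append(target)
    return result


for asset, contracts in contracts_by_asset(SAMPLE).items():
    print(asset, "is governed by", ", ".join(contracts))
```

An editor-facing tool could use a lookup like this to open the scanned agreement directly from the image's record.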

Standards

As you've seen, PRISM is one to consider. I think that the OASIS XACML standard also looks good (pcmag.com has a nice brief summary of what XACML is about; wouldn't it be great if the home page for each OASIS standard had such a summary?), and an open source XACML engine has just appeared on Google Code, but XACML's flexibility may have turned it into something that's a bit too abstract for people in the publishing industry. Although XACML has been around for at least five years, the existence of free working code could now lay the groundwork for someone to build something specific to the publishing industry's needs.

At the Online Information 2007 conference and trade show in London in December, I first heard about ACAP. This nascent standard seems more concerned with standardizing content access policies for web crawlers and search engines than with publishing industry B2B relationships, but, as often happens with standards that are related to your interests, an ACAP advocate that I met said "you could use it for that too!"

Does anyone know of a straightforward standard for a content provider to specify re-use rights for a work (an image, text content, or a combination like my magazine article example) to a publishing industry business partner?

Comments

(Note: I usually close comments for an entry a few weeks after posting it to avoid comment spam.)

Take a look at http://www.editeur.org/onix_licensing.html. This is enjoying some traction.

I wouldn't call it straightforward though!

- Alex.

Posted by: Alex Brown | February 21, 2008 8:51 AM

Alex,

Thanks, it does look promising, and apparently I've even mentioned it before.

Bob

Posted by: Bob DuCharme | February 21, 2008 10:05 PM

There is the Copyright Ontology that is intended for Digital Rights Management (no enforcement).
And a PhD thesis devoted to it: A Semantic Web Approach to Digital Rights Management

Posted by: Roberto García | February 23, 2008 5:20 PM

Thanks Roberto! Is this ontology being used by any publishers?

Posted by: Bob DuCharme | February 23, 2008 7:56 PM

bobdc.blog

Bob DuCharme's weblog, mostly on technology for representing and linking information.