Versioning prototype



On GitHub: rskonnord-plos/versioning-prototype

Short-term goal

AOP stands for Advance Online Publication (not Ahead Of Print).

  • Want to eliminate the lag between an article passing review and being typeset.
  • Put a quick-and-dirty PDF up as early as possible, with the nice XML coming later.
  • Be transparent: keep the historical AOP version available on the website...
  • ...but make it clear that the good version is the canonical one.

Longer-term goal

Arbitrary set of versions for all content in the corpus.

  • Preserve history of ingested versions, like a wiki.
  • Serves the goal of early posting: even earlier than AOP, with a disclaimer that the article hasn't passed peer review yet.

Longer-term goal

More flexible notion of what constitutes an article.

  • Want a data model of polymorphic DOI-bearing things.
  • An "article" might one day be only a data set or a single figure.

Limitations of old data model

  • One thing ↔ one name
  • Rigid two-layer tree of DOIs: articles and assets.
  • Terrible asset ID scheme.

The beginnings of a solution for now, provided we don't have to change everything

Versioned files

The Content Repo provides features for storing all versions of the articles' component files side-by-side.

  • They don't inherently come with distinct names.
  • Putting them in the CRepo generates a UUID pointer.
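As a rough illustration of the idea, here is a minimal in-memory sketch of a store that keeps every version of a file side by side and hands back a UUID pointer on each write. The class `VersionedFileStore` and its methods are hypothetical and are not the Content Repo's actual API.

```python
import uuid
from collections import defaultdict


class VersionedFileStore:
    """Hypothetical in-memory stand-in for a versioned content repository."""

    def __init__(self):
        self._blobs = {}                    # UUID string -> file contents
        self._versions = defaultdict(list)  # file key -> UUIDs, oldest first

    def put(self, key, data):
        """Store a new version under `key` and return its UUID pointer."""
        pointer = str(uuid.uuid4())
        self._blobs[pointer] = data
        self._versions[key].append(pointer)
        return pointer

    def get(self, pointer):
        """Fetch the exact version that a UUID pointer refers to."""
        return self._blobs[pointer]

    def versions(self, key):
        """List every UUID stored under `key`, oldest first."""
        return list(self._versions[key])


store = VersionedFileStore()
aop_pdf = store.put("article.pdf", b"quick-and-dirty PDF")
final_pdf = store.put("article.pdf", b"typeset PDF")
```

Both versions share the key `article.pdf`, so the only thing that tells them apart is the generated pointer. That is why anything built on top has to record the UUIDs it cares about.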

Scholarly works

A content-agnostic representation of a "DOI-bearing thing".

Associated with:

  • a DOI
  • a type
  • a collection of files
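One way to picture such a record, as a sketch only: the field names, the type strings, and the placeholder DOIs below are assumptions made for illustration, not the prototype's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class ScholarlyWork:
    """A content-agnostic 'DOI-bearing thing' (hypothetical field names)."""
    doi: str
    work_type: str  # e.g. "article" or "figure"; values are illustrative
    files: dict = field(default_factory=dict)  # file name -> repo UUID pointer


# Placeholder DOIs and pointers, not real identifiers.
article = ScholarlyWork(
    doi="10.9999/example.0001",
    work_type="article",
    files={"manuscript.xml": "uuid-1", "manuscript.pdf": "uuid-2"},
)
figure = ScholarlyWork(
    doi="10.9999/example.0001.g001",
    work_type="figure",
    files={"figure.png": "uuid-3"},
)
```

Nothing in the record says that the figure belongs to the article. The two are peers, which is what lets a figure or a data set carry a DOI of its own.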

Relationships between works

A problem: we want a figure to stand on its own, but the source of truth for all of its metadata is the parent article's XML document.

More complex relationships may emerge one day—for example, an early-posted data set becomes an asset of a full-fledged article.

Relationships between works

Solution: relationship objects that link scholarly works.

Relationships are polymorphic like the works themselves. We defined an "asset-of" type that defines where to find a figure's metadata.
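A minimal sketch of how an "asset-of" link could answer the question "where does this figure's metadata live?". The `Relationship` class, its field names, and the `metadata_source` helper are hypothetical, and the DOIs are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """A typed link between two works (hypothetical shape)."""
    rel_type: str    # e.g. "asset-of"
    source_doi: str  # the work the link starts from (the figure)
    target_doi: str  # the work it points at (the parent article)


def metadata_source(doi, relationships):
    """Return the DOI of the work whose XML holds the metadata for `doi`."""
    for rel in relationships:
        if rel.rel_type == "asset-of" and rel.source_doi == doi:
            return rel.target_doi
    return doi  # no asset-of link: the work is its own source of truth


relationships = [
    Relationship("asset-of", "10.9999/example.0001.g001", "10.9999/example.0001"),
]
```

Here `metadata_source("10.9999/example.0001.g001", relationships)` resolves to the parent article, while the article itself, having no asset-of link, resolves to its own DOI.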

Revisions

The versions of an article that appear to readers are different from the history of ingested snapshots.

  • We've been through many iterations of requirements, and still aren't done.
  • For now, we expect that a revision number will be provided in the XML.
  • A revision number points to the most recently ingested version of the article that carries that number.
  • Matching revision numbers propagate to works representing the article's assets.
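The last two rules can be sketched as a small lookup table, assuming each ingest has some identifier and that the asset DOIs are known at ingest time. `RevisionTable`, its method names, and the ingest ids are hypothetical.

```python
class RevisionTable:
    """Maps (DOI, revision number) to the latest ingest carrying that number."""

    def __init__(self):
        self._by_work = {}  # DOI -> {revision number -> ingest id}

    def record_ingest(self, article_doi, asset_dois, revision, ingest_id):
        """Point `revision` at this ingest for the article and all its assets."""
        for doi in [article_doi, *asset_dois]:
            # A later ingest with the same revision number overwrites the earlier one.
            self._by_work.setdefault(doi, {})[revision] = ingest_id

    def lookup(self, doi, revision):
        """Return the ingest id that `revision` currently resolves to."""
        return self._by_work[doi][revision]


table = RevisionTable()
article = "10.9999/example.0001"   # placeholder DOIs
assets = ["10.9999/example.0001.g001"]

table.record_ingest(article, assets, revision=1, ingest_id="ingest-a")
table.record_ingest(article, assets, revision=1, ingest_id="ingest-b")  # re-ingest
table.record_ingest(article, assets, revision=2, ingest_id="ingest-c")
```

After these three ingests, revision 1 resolves to `ingest-b` for both the article and its figure, and `ingest-a` remains only in the ingest history. This is the sense in which reader-facing revisions differ from ingested snapshots.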

The near future

  • More prototyping
  • Polymorphic controllers
  • Legacy data types: when something belongs to "an article", does it belong to one version or all?
    • Comments
    • Metrics
    • Subject areas
    • Etc., etc., etc.
  • Performance
    • Dynamic metadata
    • XML datastore

Questions?
