Abstract
Motivation Curation is essential for any data platform to maintain the quality of the data it provides. Both the existing databases that require maintenance and the volume of newly published information that needs to be surveyed are growing rapidly. Keeping up with this growth often demands more efficient curation, which in turn requires modern curation tools. However, curation interfaces are often complex and difficult to develop further. Furthermore, opportunities to experiment with curation workflows may be lost to a lack of development resources or a reluctance to change sensitive production systems.
Results We propose a decoupled, modular and scriptable architecture for building curation tools on top of existing platforms. Instead of modifying the existing infrastructure, our architecture treats the platform as a black box and relies only on its public APIs and web application. Because the tool runs as a separate program, developers and curators have more freedom to change it without touching the platform itself. This flexibility allows new curation workflows to be prototyped quickly and additional analyses to be built around the data platform. The tool can also streamline and enhance the curator’s interaction with the platform’s web interface. We have implemented this design in cmd-iaso, a command-line curation tool for the identifiers.org registry.
Availability The cmd-iaso curation tool is implemented in Python 3.7+ and supports Linux, macOS and Windows. Its source code and documentation are freely available from https://github.com/identifiers-org/cmd-iaso. It is also published as a Docker image at https://hub.docker.com/r/identifiersorg/cmd-iaso.
Contact hhe{at}ebi.ac.uk
Competing Interest Statement
The authors have declared no competing interest.