PIDapalooza 2018

Welcome to PIDapalooza 2018...where anything goes...as long as it goes on forever.

12:00pm CET

free, open data metrics for all

The Make Data Count (MDC) project aims to develop data-level metrics and elevate data as a first-class research output. Data usage metrics rely on persistent identifiers, but usage differs by content type. DOIs are used for both publications and datasets, but the underlying ways of tracking usage are different. Recognizing that metrics around PIDs are community specific, the MDC team has looked at the various community attributions around PIDs and is tackling this issue by developing standards and facilitating community involvement to drive adoption of data-level metrics.

Speakers

John Chodacki

University of California Curation Center (UC3) Director, California Digital Library

John Chodacki is Director of the University of California Curation Center (UC3) at California Digital Library (CDL)

Trisha Cruse

Director, DataCite

Martin Fenner

Technical Director, DataCite

Daniella Lowenberg

Sr. Data Publishing Product Manager, University of California

Principal Investigator and lead of the Make Data Count initiative and Sr. Product Manager for Dryad...

Tuesday January 23, 2018 12:00pm - 12:30pm CET
Stage 1

Bridging worlds

12:00pm CET

[IDs/Repos] are doing it wrong: a debate

The future of open science lies in a distributed network of resources bolstered by core research infrastructures. Resources are created, housed and preserved at every level in the research ecosystem, but repositories, whether subject or institutional, are a crucial network guaranteeing access to and preservation of precious scholarly resources. Ensuring that these resources are discoverable and reusable, and that they bring recognition to their creators, is at the heart of the repository community's drive to improve its services. Bringing PIDs into the heart of repository workflows and delivering their potential to increase the effectiveness and openness of research communication is essential. By marrying established PIDs, such as ORCID iDs, to popular repository platforms, like DSpace and Fedora, we can leverage the power of both networks, adding new PIDs to repositories and making more of the connections between resources visible. In this session, we will discuss the ways that repository and identifier systems need to evolve to improve discovery and reuse. Representatives from DuraSpace and ORCID will debate the ways that their communities are working together to embed PIDs in repository workflows, and present their competing priorities and visions for standards and best practices for identifiers in the repository space.

Speakers

Josh Brown

Director, Partnerships, ORCID

Josh works with stakeholders, with a focus on research funders and our partners, to support understanding and engagement, and promote adoption of ORCID. He directs the operations of ORCID EU, leading the ORCID contribution to the THOR Project. He was previously the ORCID Regional...

Michele Mennielli

Sr Global Strategist, LYRASIS

Michele is International Outreach Representative at LYRASIS. His primary responsibility is to broaden and extend the organizational reach and strengthen global partnership in support of the preservation and accessibility of cultural heritage and academic resources globally. He’s...

Tuesday January 23, 2018 12:00pm - 12:30pm CET
Stage 3

Bridging worlds

2:00pm CET

All about that BASE

This conversation will be about how DOIs and ORCID iDs only recently entered BASE, one of the largest academic search engines, which happens to be non-commercial. BASE harvests bibliographic metadata via OAI-PMH from thousands of publication repositories – each of which has its own idea about Dublin Core, the lowest common denominator of metadata formats. So we normalize the data from each repository. Authors have been able to claim their own publications in BASE since mid-2017 by connecting them to their ORCID iD. It is an open research question how this linkage information could percolate back to the source repositories. Suggestions welcome!

Speakers

Christian Pietsch

Bielefeld University

Tuesday January 23, 2018 2:00pm - 2:30pm CET
Stage 3

Bridging worlds

2:30pm CET

How Portugal tackles Org IDs

This presentation describes the Portuguese approach to managing organizational identifiers within the national and international ecosystem of research information systems. It describes the goals, methodology, architecture and use cases of how bridge identifiers, in this case ISNI and Ringgold, have helped to break down information system silos, providing added value to the stakeholders. The Org Ids project is part of the PTCRIS (www.ptcris.pt) program, which aims to define standards and build infrastructures that ensure the integration of information systems supporting scientific activity into a single, coherent and integrated ecosystem.

Speakers

Maria João Amante

ISCTE-IUL

Tuesday January 23, 2018 2:30pm - 3:00pm CET
Stage 3

Bridging worlds

2:30pm CET

Unidentifiable identifiers (and other perils of actually trying to build clever tools using PIDs)

Developers love PIDs! They provide trustworthy, centralized, reliable sources of information that are always up to date, which allows us to make better, faster applications that play nicely with others. But sometimes it’s not so easy to integrate PIDs. Without a central registry to provide current, machine-readable information about which PIDs are out there, their status, their format and their metadata APIs, integrating a broad range of PIDs into an application is a formidable challenge. For example, at ORCID, several staff members have been working for months to compile a comprehensive list of PID types for research outputs for use in our application interface, and, in some cases, we’ve resorted to emailing individuals at various organizations just to find the correct format for a particular PID. This session will discuss some of the challenges that application developers face in integrating a full range of PIDs, workarounds that we’re currently using, and the great joy that a centralized PID registry would bring.

Moderators

ORCID

Speakers

Liz Krznarich

Adoption Manager, DataCite/ROR

Eric Olson

Engagement Lead, North America, ORCID

Eric supports ORCID members as they develop new and existing integrations and workflows. Before joining ORCID, Eric worked on the PressForward publishing software at the Roy Rosenzweig Center for History and New Media, where he recruited and trained research organizations to utilize...

Tuesday January 23, 2018 2:30pm - 3:00pm CET
Stage 1

Bridging worlds

3:30pm CET

PIDs, Information Types and Collections - a Research Data Framework

PIDs are a requirement for reliable research data reuse and data sharing. Standardized PID information types are valuable for the parametrization of data management workflows. And building collections is the natural way to reassemble previously established research data as a source for new scientific results. These three together provide a lightweight, but rather comprehensive, research data framework. This framework becomes interoperable across sites and PID systems with the PID-system-agnostic RDA definition of collections, and with the standardization of PID information types given by emerging Data Type Registries. This session will show the interdependencies of these concepts, describe the necessary technology to provide such an interoperable research data framework and discuss use cases.

Speakers

Ulrich Schwardmann

University of Göttingen

Tuesday January 23, 2018 3:30pm - 4:00pm CET
Stage 2

Bridging worlds

4:00pm CET

What questions can PIDs answer?

Metadata are the structured and standard subset of the documentation that is required to understand scientific datasets. Identifiers are the connections between these two worlds and make it possible for users to get the information they need to understand the data they discover. Let’s explore some questions that these identifiers can answer.

Speakers

Ted Habermann

Chief Game Changer, Metadata Game Changers

I am interested in all facets of metadata needed to discover, access, use, and understand data of any kind. Also evaluation and improvement of metadata collections, translation proofing. Ask me about the Metadata Game.

PID Questions pptx

Tuesday January 23, 2018 4:00pm - 4:30pm CET
Stage 2

Bridging worlds

10:00am CET

Metadata 2020: Harnessing PID-power for the greater good.

This session and discussion presents the work that Metadata 2020 is doing to help scholarly communities collaborate to achieve metadata sharing solutions. With this session, we aim to advance discussions on how each community can best collaborate to solve specific issues. We will discuss the ways in which our efforts could be advanced, how to optimize usefulness to the wider scholarly communications community, and how our work can be crafted into an overarching Metadata Maturity Model.
Help us figure out how to connect the dots!

Speakers

John Chodacki

University of California Curation Center (UC3) Director, California Digital Library

John Chodacki is Director of the University of California Curation Center (UC3) at California Digital Library (CDL)

Chris Erdmann

Chief Strategist for Research Collaboration, Libraries, North Carolina State University

Ginny Hendricks

Director of Member & Community Outreach, Crossref

Since 2015, Ginny has been developing the member and community outreach team at Crossref encompassing outreach and education, user experience and support, and metadata strategy. She is the Instigator of the Metadata 2020 collaboration to advocate for richer, connected, reusable and...

Alice Meadows

Director, Communications, ORCID

Wednesday January 24, 2018 10:00am - 10:30am CET
Stage 1

Bridging worlds

10:30am CET

OpenRefine cleans up messes

OpenRefine (formerly Google Refine) is a tool for cleaning up messy data, but it is also well-suited to cross-linking different identifiers through their metadata entries. I will share some experiences in using OpenRefine to interlink GRID, ISNI, Wikidata, and our own internal institution identifiers. This matching process also highlights metadata errors (including duplicates and incorrect merges) in the different sources, and can help improve data quality on all sides.

Speakers

Arthur Smith

American Physical Society

Arthur Smith has worked for the American Physical Society in the Journal Information Systems Department since 1995, supporting the online transformation of the peer review process and related activities for the Physical Review journals. His current position as lead data analyst involves collecting...

Wednesday January 24, 2018 10:30am - 11:00am CET
Stage 2

Bridging worlds

11:30am CET

Scientific & financial data: exploring PID-based bridges, or lack thereof

Come hear the tales of Thunken, a small R&D team with a fetish for PIDs.

Our projects focus on linking disparate documents (publications, patents, clinical trials, drug approvals, financial reports, etc.) across multiple sources to track innovations from early publications and patents to market data and financial statements. The natural separation of concerns between different stakeholders (e.g. the USPTO, the NIH, the FDA, the SEC, and the IRS in the US) has generated silos that are hard to break down and standardize.

We will present our methodology and explain how a failed project on automated patent valuation led us to create a new solution for unrefined altmetrics data, currently powered by data released by Wikimedia, StackExchange, and other sources.

Speakers

Luc Boruta

Director of Research, Thunken

pidapalooza2018 thunken pdf

Wednesday January 24, 2018 11:30am - 12:00pm CET
Stage 2

Bridging worlds

12:00pm CET

Capturing facilities: PID recommendations for identifying scientific equipment and infrastructure

Scientific user facilities are specialized government-sponsored research infrastructure available for external use to advance scientific or technical knowledge. Researchers compete for access to the specialized equipment and infrastructure at these facilities. However, facility use is not captured in publisher workflows, making it difficult for the sponsor agencies and host institutions (typically government laboratories) to assess the scientific impact of these public investments. This issue was discussed by the Society for Science at User Research Facilities (SSURF) at the first PIDapalooza meeting in Reykjavik in 2016. In 2017, ORCID convened a group of publishers and U.S. Department of Energy (DOE) user facilities managers to ascertain what data would help agencies and facilities to map impact, what PIDs would facilitate the description of the facilities, and how to enable collection in a manner that optimizes reporting of scientific impact. The Working Group is developing a set of findings and recommendations, which it will share with other facilities for comment in 3Q 2017. A presentation at PIDapalooza would provide a venue to share what we learned, solicit comment, and encourage the launch of pilot implementation projects.

Speakers

Erin Arndt

Wiley

Laure Haak

I care about effective infrastructures for supporting open research, scholarship, and innovation. Talk to me about persistent identifiers, researcher involvement in managing their own information, ensuring credit for a wide range of contributions, and privacy. Or the Packers. Or piano...

Crystal Schrof

Oak Ridge National Laboratory (ORNL)

Susan White-DePace

Advanced Photon Source, Argonne National Laboratory

Wednesday January 24, 2018 12:00pm - 12:30pm CET
Stage 1

Bridging worlds

2:00pm CET

Event Data: Bridging persistent and not-so-persistent identifiers

Event Data is a new service from Crossref. It collects links from the web to items of Crossref and DataCite Registered Content. It describes new forms of scholarship beyond traditional publishing, and forms an underlying data-set that can be used for altmetrics, amongst other things. It records this data as a stream of links between pairs of URLs (e.g. Tweets, blog posts, Wikipedia pages linking to scholarly articles). Sometimes those URLs are DOIs, sometimes they are publishers' article landing pages. In doing this, it bridges the world of persistent identifiers and plain old URLs. I'll describe the service, pitfalls, the trends we tend to see, and why you'd want to use it.

https://www.crossref.org/blog/bridging-identifiers-at-pidapalooza/

Speakers

Joe Wass

Head of Software Development, Crossref

Wednesday January 24, 2018 2:00pm - 2:30pm CET
Stage 3

Bridging worlds

2:30pm CET

Domination and submission: The struggle to retain ownership/control of national Research Information

Persistent Identifiers (PIDs) used in academic research live in two worlds: the world of open science and the world of strategic decision making. Whereas open science aims to share the resources and outcomes of publicly funded research as widely as possible, strategic management of research operates in the competitive space of attracting talent and winning funding grants.

This tension poses a particular challenge to mobilizing national sharing of research information. At present, academic publishers and related commercial service providers are poised to dominate the domain of Research Information. While the publishing industry provides many crucial services to the academic enterprise, the move to increased openness complicates this centuries-old relationship.

In this session, we draw on lessons learned from an ORCID pilot project to sketch a PID-centric Research Information strategy for the Netherlands.

Speakers

John Doove

Program manager, SURF

Passionate about open science and how to get there. Involved in the past in stuff like Virtual Research Environments, Enhanced Publications, Open Access, Open Research information. Currently striving to build bridges in open science between innovation and practice on the one hand...

Clifford Tatum

SURF / CWTS, Leiden University

Consultant, Persistent Identifiers (innovation group) at SURF, in the Netherlands. And researcher at CWTS, Leiden University, focusing on infrastructures of openness in relation to emerging evaluation practices.

Wednesday January 24, 2018 2:30pm - 3:00pm CET
Stage 2

Bridging worlds

3:30pm CET

Unsolved problems with PIDs and PID systems

The PID community has consolidated around a few key concepts: that PIDs are short, URI-compatible strings that are explicitly registered using a PID system, which maintains a database of such registrations; that PIDs carry some type and amount of associated citation metadata; and that PIDs resolve to URLs. So, are PIDs a solved problem? No! Here are four significant problems that remain unsolved. Solving them may require that the PID community collaborate to achieve newfound interoperability across PID types and systems, and to provide additional services beyond simple registration and resolution.

1. Are PIDs being used? How can we tell? That PIDs are being registered and assigned to resources is clear enough, but if a resource is cited or accessed via a local URL or other non-persistent identifier, then the existence of any PID it might have is pointless. This problem is exacerbated by the fact that the HTTP redirection mechanism exposes impermanent URLs to browsers and users, thus making it inevitable that such URLs will be bookmarked and subsequently used. Solving this problem may require work outside of PID systems proper, including web crawling and repository log analysis to detect resource references that could have been done through PIDs but weren't. Within PID systems themselves it may require providing additional services, such as facilitating reverse lookups (given a resource or URL, what PID(s) have ever been assigned to it?) and maintaining history of URL assignments (what has this PID ever identified?).

2. Who owns an identifier? Who may modify it? In the use case of a repository system that registers a PID for a resource it manages, the repository system is likely the "owner" in the sense that, should the repository move, it is the repository's responsibility to update the PID. But is it the sole owner? What if the resource author decides to move or copy the resource to a different repository? Who has rights to the identifier then, the old repository, the new, or the creator? Other use cases that involve institutions, libraries, departments, journals, and journal editors only add complexity. Some internet services, notably Wikipedia, have eschewed formal ownership models of resources, instead emphasizing the maintenance of history and the ability to undo change. Should PID systems adopt the same?

3. How can identifier aliasing, if not avoided altogether, at least be better handled? It is de facto common practice for repository systems to assign new PIDs to newly ingested resources regardless of the existence of any previously-assigned PIDs. This problem is particularly rife in the life sciences, where resources are often co-registered in multiple databases, receiving an identifier from each. The problem then, of course, is that having multiple, equivalent PIDs for a single resource sows confusion and dilutes citation metrics. At best, current PID systems record additional identifiers as "alternative identifiers," but this is far from sufficient and far from a universal practice. Solving this problem may require that PID systems maintain better and more comprehensive records of identifier aliasing (including aliasing that occurs across PID types and PID systems), and support operations across whole "equivalence sets" of identifiers.

4. When a PID system itself moves or fails, what needs to persist? The awkward, but ultimately successful handover of the PURL system from OCLC to the Internet Archive should serve as a wakeup call to the PID community. While PID systems provide well-defined means of accommodating the movement of individual PID-identified resources over time, the movement of entire PID systems remains a very much ad hoc process. Must all the services and concepts of the old PID system be preserved? If not, which can safely be discarded? Note that the PURL system provided some unique characteristics in terms of user roles and resolution options. Solving this problem may require that standard models of PIDs and new forms of interoperability be adopted.

Speakers

Greg Janée

University of California at Santa Barbara

Wednesday January 24, 2018 3:30pm - 4:00pm CET
Stage 2

Bridging worlds

4:00pm CET

Managing PID collisions

PIDs are increasingly being used to identify research resources - things that are used to perform a research study. There are as many types of research resources - rocks, antibodies, protein structures, natural history collections, databases, software applications… - as there are identifier types applied to them - DOIs, RRIDs, IGSNs, Accession Numbers… What is the purpose of a research resource ID? Is there enough similarity between the resource types to justify a common approach? Do collisions between identifier types affect adoption - such as PDB IDs and DOIs for protein structures? Or organization IDs and RRIDs for user facilities? To manage collisions, are additional infrastructural elements required? What principles or policies could be helpful in enabling different ID systems to co-exist?

Speakers

Laure Haak

I care about effective infrastructures for supporting open research, scholarship, and innovation. Talk to me about persistent identifiers, researcher involvement in managing their own information, ensuring credit for a wide range of contributions, and privacy. Or the Packers. Or piano...

Kerstin Lehnert

Doherty Senior Research Scientist, Columbia University

Kerstin Lehnert is Doherty Senior Research Scientist at the Lamont-Doherty Earth Observatory of Columbia University and Director of the Interdisciplinary Earth Data Alliance that operates EarthChem, the System for Earth Sample Registration, and the Astromaterials Data System. Kerstin...

Maryann Martone

University of California San Diego

Johanna McEntyre

EMBL-EBI

Sarala Wimalaratne

Project Lead, EMBL-EBI

Wednesday January 24, 2018 4:00pm - 4:30pm CET
Stage 3

Bridging worlds

4:00pm CET

The ideal persistent identifier world

At a recent THOR project event (https://project-thor.eu/) we asked delegates what an ideal persistent identifier world would look like. They described a world in which reliable, dependable PID infrastructures (with zero downtime) were constantly delivering tangible benefits, such as time and effort savings, to researchers and others. In this world we would never have to make the business case or argue for the value proposition of PIDs. We’d be too busy reaping the benefits of using them. Ultimately, what we would like to see is a world in which a network of PIDs and the relationships between them can be used to capture the whole web of connections between things. The creation of this graph demands both greater coverage and adoption of PIDs, current and yet-to-be-built. We will widen this discussion out with a follow-up session at FORCE 2017 in Berlin (https://www.force2017.org/) to discuss the current gaps in the PID web and to prioritize PIDs for content, resources or connections that would be of most immediate perceived value for the scholarly community. We'd like to bring the results of this discussion to PIDapalooza to present them to the identifier folks there, and discuss how we can work together to fill those gaps. Maybe there will be some new collaborations or business opportunities created, maybe we'll find that we haven't done as good a job of promoting our services as we may have assumed. In any case, it should be food for thought...

Speakers

Josh Brown

Director, Partnerships, ORCID

Josh works with stakeholders, with a focus on research funders and our partners, to support understanding and engagement, and promote adoption of ORCID. He directs the operations of ORCID EU, leading the ORCID contribution to the THOR Project. He was previously the ORCID Regional...

Tom Demeranville

ORCID

Melissa Harrison

Team Leader, Literature Services, EMBL European Bioinformatics Institute

Wednesday January 24, 2018 4:00pm - 4:30pm CET
Stage 1

Bridging worlds