Welcome to PIDapalooza 2018...where anything goes...as long as it goes on forever.   

Wednesday, January 24 • 10:30am - 11:00am
Open Refine cleans up messes

OpenRefine (formerly Google Refine) is a tool for cleaning up messy data, but it is also well-suited to cross-linking different identifiers through their metadata entries. I will share some experiences in using OpenRefine to interlink GRID, ISNI, Wikidata, and our own internal institution identifiers. This matching process also highlights metadata errors (including duplicates and incorrect merges) in the different sources, and can help improve data quality on all sides.

Arthur Smith

American Physical Society
Arthur Smith has worked for the American Physical Society in the Journal Information Systems Department since 1995, supporting the online transformation of the peer review process and related activities for the Physical Review journals. His current position as lead data analyst involves collecting...

Wednesday January 24, 2018 10:30am - 11:00am CET
Stage 2