PHAIDRAcon'21
Day 1, Roundtable - November 17, 2021 - Reevaluating Open Source in Academia
Much has been made of the role and value of open source software within academia. But is open source, on its own, the panacea that it was once thought to be? After all, open source is really only about a software license.
As open source software has matured and, in many areas, become the new norm, people are looking deeper into the mechanics of collaboration. In the first of this year’s PHAIDRAcon debates, we will explore the relevance of this debate to the PHAIDRA project’s objective of sustainable, long-term data preservation.
Questions
- Is open source code a solution for software longevity, or is this more about community, collaboration and agreed standards?
- In the long term, would we do better to decouple data (as far as possible) from the short-term liabilities of software?
- And in a world of tightening academic budgets, is sustainability more about ensuring that we use technology effectively to derive ongoing value from the data we seek to preserve?
Speakers Roundtable 1
Raman Ganguly earned his degree in Media Engineering at the St. Pölten University of Applied Sciences, and he is a Level C IPMA project manager. Before finishing his university degree, he worked as a software developer at various companies and headed web-development departments at two media agencies.
He joined the team of the Computer Centre at the University of Vienna in 2008. Since 2011, one of his main focuses has been the management and archiving of research and educational data. In this capacity he is responsible for designing the technical infrastructure of the University's data management ecosystem and for the sustainable operation of the technical infrastructure for long-term data preservation. He is the technical director of the PHAIDRA digital asset management system for long-term preservation. PHAIDRA is currently used by the University of Vienna and 21 institutions throughout Europe.
G. Sayeed Choudhury is the Associate Dean for Research Data Management and Hodson Director of the Digital Research and Curation Center at the Sheridan Libraries of Johns Hopkins University. He leads the University's open source programs office (OSPO). Choudhury is also a member of the Executive Committee for the Institute of Data Intensive Engineering and Science (IDIES) based at Johns Hopkins.
Choudhury was a President Obama appointee to the National Museum and Library Services Board. He was a member of the National Academies Committee on Forecasting Costs for Preserving, Archiving, and Promoting Access to Biomedical Data. He was a member of the National Academies Board on Research Data and Information and the Blue Ribbon Task Force on Sustainable Digital Preservation and Access. He has testified for the Research Subcommittee of the Congressional Committee on Science, Space and Technology.
He was a member of the board of the National Information Standards Organization, OpenAIRE2020, DuraSpace, the ICPSR Council, Digital Library Federation advisory committee, Library of Congress' National Digital Stewardship Alliance Coordinating Committee, Federation of Earth Scientists Information Partnership (ESIP) Executive Committee and the Project MUSE Advisory Board. Choudhury was a member of the ECAR Data Curation Working Group. He has been a Senior Presidential Fellow with the Council on Library and Information Resources, a Lecturer in the Department of Computer Science at Johns Hopkins and a Research Fellow at the Graduate School of Library and Information Science at the University of Illinois at Urbana-Champaign. He is the recipient of the 2012 OCLC/LITA Kilgour Award.
Choudhury has served as principal investigator for projects funded through the National Science Foundation, Institute of Museum and Library Services, Library of Congress' NDIIPP, Alfred P. Sloan Foundation, Andrew W. Mellon Foundation, Open Society Foundation, Microsoft Research, and a Maryland-based venture capital group.
He is the Product Owner for the Data Conservancy, which focuses on the development of data curation infrastructure, and for the Public Access Submission System, which supports simultaneous submission of articles to PubMedCentral and institutional repositories. He has oversight for data curation research and development and data archive implementation at the Sheridan Libraries at Johns Hopkins University. Choudhury has published articles in journals such as the International Journal of Digital Curation, D-Lib, the Journal of Digital Information, First Monday, and Library Trends. He has served on committees for the Digital Curation Conference, Open Repositories, Joint Conference on Digital Libraries, and Web-Wise. He has presented at various conferences including Educause, CNI, JISC-CNI, DLF, ALA, ACRL, and international venues including IFLA, the Kanazawa Information Technology Roundtable, eResearch Australasia, the North America-China Conference, eResearch New Zealand and the Arabian-Gulf Chapter of the Special Libraries Conference.
Daniel Bernstein is currently the technical lead for the Fedora Repository Project and a staff member at Lyrasis. He is responsible for helping to set priorities and coordinating development on Fedora and related ecosystem projects. He also leads workshops and technical presentations, as well as the weekly Fedora tech community call. In addition to his work on Fedora, he is a core developer of the DuraCloud codebase. Prior to joining Lyrasis (formerly DuraSpace), Daniel was the principal developer of Archive-It while working for the Internet Archive.
Radu Meza is an Associate Professor in the Journalism and Digital Media Department at the College of Political, Administrative and Communication Sciences, Babeș-Bolyai University. He is coordinator of the first Digital Media Bachelor program in Romania (since 2016), Head of the Journalism and Digital Media Department (since 2021), Chair of the Babeș-Bolyai Senate Committee for Curriculum (since 2020), and an evaluator for the Romanian Agency for Quality Assurance in Higher Education (since 2018). He teaches Media Analysis, Digital Data Analysis, New Media Theory, Content Management Systems and Web Design. Radu Meza’s recent research focuses on analyzing dangerous speech, hate speech and offensive speech in Romanian and Hungarian public Facebook contexts using computational sociology approaches.
Day 2, Roundtable 2 - November 18, 2021 - Digital Humanities in the API Economy
The inner workings of organisations, from lean tech start-ups to government departments and the largest global corporations, are all being abstracted into open, standardised and easily accessible Application Programming Interfaces (APIs).
By shielding its participants from the complexities of the underlying IT systems, the API economy is giving birth to a new breed of online entrepreneurship. How do we get things (products, services, information) to people in new and exciting ways? How do we create new experiences that blend the online, compute power and the physical?
Questions
- What does all this mean for digital humanities?
- Is there a new model for digitization of our cultural heritage?
- What opportunities does this open up for cross-disciplinary and cross-organisational sharing and collaboration?
- Is there a new blurring of the boundaries between academia, museums and archives, amateur enthusiasts and the general public?
- Does this offer new service/revenue opportunities?
- Most importantly, how will this work in reality?
Speakers Roundtable 2
Raman Ganguly earned his degree in Media Engineering at the St. Pölten University of Applied Sciences, and he is a Level C IPMA project manager. Before finishing his university degree, he worked as a software developer at various companies and headed web-development departments at two media agencies.
He joined the team of the Computer Centre at the University of Vienna in 2008. Since 2011, one of his main focuses has been the management and archiving of research and educational data. In this capacity he is responsible for designing the technical infrastructure of the University's data management ecosystem and for the sustainable operation of the technical infrastructure for long-term data preservation. He is the technical director of the PHAIDRA digital asset management system for long-term preservation. PHAIDRA is currently used by the University of Vienna and 21 institutions throughout Europe.
Marta Palandri, software developer, originally earned her Master’s degree in Anglophone Literatures and Cultures (University of Vienna). In her free time she likes to explore the meeting point between literature and technology. She joined the University of Applied Arts Vienna as a back-end developer in 2020.
Franco Niccolucci is the director of the VAST-LAB research laboratory at PIN in Prato, Italy. A professor at the University of Florence until 2008, he directed the Science and Technology in Archaeology Research Center at the Cyprus Institute, Nicosia, until 2013. Prof Niccolucci has coordinated several EU-funded projects on the applications of Information Technology to Archaeology, and is currently the coordinator of ARIADNEplus, a research infrastructure on archaeological data. His main research interests concern knowledge organization of archaeological documentation and the communication of cultural heritage. He is currently the Editor-in-Chief of JOCCH, the ACM Journal of Computing and Cultural Heritage. He has authored about 100 papers and book chapters.
Elisabetta Lazzaro is Professor of Creative and Cultural Industries Management at the Business School for the Creative Industries, University for the Creative Arts (UK). Her research focuses on the economics, management, entrepreneurship and policy of the arts, culture and creative industries, including the boundary-spanning challenges and opportunities for cultural heritage and applied digitisation as drivers for sustainable socio-economic innovation in urban and regional development. www.uca.ac.uk/staff-profiles/elisabetta-lazzaro
Arianna worked as Science Officer at the European Science Foundation (Humanities) in Strasbourg (France) from 2009 to 2012. Her primary responsibilities included the supervision of instruments to fund collaborative research in the humanities and the coordination of strategic activities related to the works of the Standing Committee for the Humanities.
Until recently she lived in Singapore, where she continued working as a freelance consultant for the ESF on research evaluation missions and peer review activities, while maintaining an active role in the digital humanities community by serving on international committees and contributing to various activities in the field. She recently relocated to London, where she is looking forward to combining her freelance work with new and exciting professional experiences.
As a Research Associate at the Centre for Computing in the Humanities, King's College London, she worked on various digital humanities research projects as lead analyst or in a supporting role.
Her personal research interests focus on the modelling of scholarly digital resources related to primary sources. Modelling is at the essence of language and knowledge, an act that is refractory to classification as much as revealing about what we humans are and are not.
She has lectured and published on digital humanities, in particular on digital palaeography and digital philology, scholarly representation and modelling; she has organised conferences and workshops in digital humanities, and is an active member of its international community.
Arianna graduated with a BA (Hons) in Communication Sciences (computational linguistics) from the University of Siena in 2001. She received an MA in Applied Computing in the Humanities from King's College London in 2004 and was awarded her PhD in Manuscript and Book Studies by the University of Siena in 2005.
Day 3, Conference Day - November 19, 2021
PHAIDRAcon'21 Conference Day
Abstract
Knowledge about institutional repositories is increasingly becoming a mandatory feature of Digital Humanities teaching: not only do students (and thus future researchers) need to understand the workflows behind the sustainable storage of their research data; they also need to be aware of where their institutional repository stands within the broader ecosystem of tools and infrastructures. Against that background, the presentation will argue for integrating data management knowledge into regular Digital Humanities teaching, and provide some examples of how to put this into practice.
About Thomas Wallnig
Thomas Wallnig is a historian working at the University of Vienna (AT), teaching, among other things, courses in the university’s recently established Digital Humanities Master’s Program. In the academic year 2020/21 he also acted as a visiting professor at the universities of Padua and Klagenfurt. His background is that of an intellectual historian of early modern Catholic Europe (Critical Monks: The German Benedictines, 1680–1740, Brill). In recent years, especially in his role as co-chair of COST Action IS1310, he has developed a broader interest in tools and methods of computer-based analysis, especially with regard to early modern letters: https://univerlag.uni-goettingen.de/handle/3/isbn-978-3-86395-403-1
Abstract
Serial sources such as toll registers, municipal account books, company accounting records, or trade registers provide valuable data for economic historians. Because they allow scholars to study the evolution of economic behaviour at both the macro- and microstructural levels, these books have long been used as primary sources in economic and social history. However, the sheer volume of data has often hindered scholarly editions and detailed research. On the one hand, online editions can open up new opportunities for dealing with this type of source, but on the other hand they create many new problems, especially for historians without a sound technical background.
The Danube Trade Project, which has been running since 2008, makes available essential sources on the economic history of Austria during the 17th and 18th centuries in the form of open-access databases. The analysis of these databases can be particularly helpful in identifying the activities of trade networks as well as single merchants, shipmasters, and other individuals who made up the bulk of market participants. They also provide quantitative and qualitative data for the study of trade cycles, the type and quantity of the commodities transported on the river, and migratory processes, to name just a few examples.
The main objective of this paper is firstly to present the sources and databases of the “Danube Trade Project” and secondly to discuss the challenges, failures and possible solutions to problems encountered during the (still ongoing) editing process. http://www.univie.ac.at/donauhandel/
About Andrea Serles
Andrea Serles studied History at the University of Vienna, where she has held the position of research assistant since 2013. Before that, she worked at the Institute for Medieval and Early Modern Material Culture (IMAREAL, subdivision of the Austrian Academy of Sciences/University of Salzburg). The main focus of her scientific work is the edition of sources related to the Danube trade in early modern times.
Apart from the history of trade, Ms. Serles’ research interests cover the history of public finance, administration, and constitution as well as town history.
Abstract
"IMAGE+ Platform for Open Art Education" is an Austrian image and image research platform dedicated to the enhancement of teaching. IMAGE+ offers a comprehensive repository of high-quality digital image reproductions of artistic works. The images are enriched with high-quality metadata; scientifically validated information about the artworks is provided. IMAGE+ is available for teachers and students at participating universities. Furthermore, artists and art education graduates may use the database for their daily work and ongoing training. The project is anchored at the University of Applied Arts Vienna and is realized in cooperation with the University of Art and Design Linz, University Mozarteum Salzburg, Austrian Academy of Sciences (ÖAW) and the documentation platform of Austrian art basis Wien.
https://www.angewandtekunstgeschichte.net/forschung/image-platform-for-open-art-education
About Astrid Poyer
Astrid Poyer has been a research assistant at the Department of Art History at the University of Applied Arts Vienna since 2015 (among others, on the FWF project A Matter of Historicity – Material Practices in Audiovisual Art). She is currently part of the ongoing research project IMAGE+ Platform for Open Art Education.
About Charlotte Reuß
Charlotte Reuß has been part of the research project IMAGE+ Platform for Open Art Education at the University of Applied Arts Vienna since the end of 2020. Her research interests include digital cultures, accessibility and economization of culture, and media art. In addition, she is working on her ongoing dissertation project "Visual Arts' Accessibility in the Digital Realm".
Abstract
This presentation will briefly outline the opportunities that arise when dealing with data management in the field of children's and youth media research. A survey has shown that there are still many open questions in this area. Often, terms like "repositories" or "permanent identifiers" are not familiar. There is also still a great deal of uncertainty about legal issues. Dealing with licences is also still quite new and unfamiliar for many. This also means that data that could be shared is often not shared. A logical consequence of this is that data that has already been collected is often not re-used. Another challenge is dealing with data that originates, for example, from the context of National Socialism, such as magazines or children's books. How do you deal with descriptions in repositories, what is allowed to be cited and how? These questions will probably have to be discussed more in the future.
About Susanne Blumesberger
Susanne Blumesberger holds a Ph.D. in Media and Communication Studies and German Studies from the University of Vienna. She is Head of the Department Repository Management PHAIDRA-Services, a lecturer at the University of Vienna, and Chairperson of the Austrian Society for Research on Children's and Youth Literature.
Abstract
This talk will look at the central role of metadata in accessing, using, and conceptualizing visual representations of analog objects and monuments. In particular, I will focus on issues connected to multiple instantiations (i.e. copies) of a given representation, usually as a result of a media change. Even though several internationally recognized standards such as VRA Core, CDWA or even EDM actively try to bridge the blurring of description and thus representation levels, their actual implementation often results in confusion on the user level. I will clarify these points through examples from my work with art history students and professionals. Finally, I will aim to suggest possible pathways for overcoming the disconnect between cultural heritage data providers and users.
About Fani Gargova
Fani Gargova is a Lecturer in Art History at the University of Vienna. She received her doctorate from the same institution in 2019 with a thesis on the Central Synagogue of Sofia. Previously, she was Byzantine research associate at the Image Collections and Fieldwork Archives of Dumbarton Oaks, Harvard University, and project coordinator of the Digitales Forschungsarchiv Byzanz (DiFaB) at the University of Vienna. Her research has been supported by the Austrian Academy of Sciences, the Rothschild Foundation (Hanadiv), and the IFK in Vienna. In her various roles, she has worked towards expanding the accessibility and visibility of architectural and art historical documentary material through exhibitions and digital humanities projects, the most recent being the online exhibition Das Erbe von Byzanz on the collection of historic photographs housed at the Vienna Department of Art History (2021).
Abstract
Collections have been going digital for more than two decades now. This talk centers on collections traditionally curated in botanical institutions. The University of Vienna houses an estimated 1.4 million botanical and mycological objects of global provenance. In 2000 we started cataloguing collection objects in the JACQ system (www.jacq.org) and released the data for use by a wider audience through the global aggregator www.gbif.org and, in parallel, the Austrian branch www.gbif.at, while also providing possibilities for visualizing the digitized objects. From the beginning the system was designed to host data from multiple institutions; more than 40 institutions from 17 countries have since joined the Consortium.
The presentation will illustrate the underlying concepts of the platform and show the chronology of its development, its current capabilities, and its implementation of FAIR concepts and semantic web technologies. It will also provide insights into how JACQ is embedded in, contributes to, and is interlinked with institutional, national, and international projects and initiatives.
About Heimo Rainer
Heimo Rainer studied Botany and Zoology at the University of Vienna and worked for 21 years in the Herbarium of the University of Vienna, focusing on the digitization of the collections. Since 2005 he has held a parallel position at the Natural History Museum in Vienna (NHM), with a similar focus on digitization in the botanical department. At the end of August 2021, he left the University of Vienna for full-time employment at the NHM. Besides extending the JACQ consortium, his main fields of engagement currently include national and international projects, among them participation in and contributions to the European Open Science Cloud (EOSC), coordinating the group Open Scientific Collections Austria (OSCA), and preparatory steps for joining the European ESFRI initiative Distributed System of Scientific Collections (DiSSCo.eu).
Abstract
This contribution discusses progress, solutions and further plans in the field of repositories, data management and long-term preservation of research data in the humanities, as formulated and developed within the framework of national and international initiatives (CLARIAH-AT, CLARIN and DARIAH, EOSC). Solutions are presented using the example of the ARCHE repository, which is operated by the ACDH-CH.
About Matej Ďurčo
Matej Ďurčo has been instrumental in the conception and growth of the Austrian Centre for Digital Humanities and Cultural Heritage (ACDH-CH), founded in 2015. Since 2009, he has been part of the Austrian research infrastructures core group, contributing substantially to the development of key technical components of the research infrastructures CLARIN and DARIAH. In his role as head of the ACDH-CH’s technical group, he has been coordinating the development of applications and the provision of services for the numerous projects of the institute and its cooperation partners.
Abstract
As policies, good practices and mandates on research data management evolve, more emphasis has been put on the licensing of data (often in the context of the reproducibility of research). Licensing information allows potential re-users to quickly identify what they can do with the data in question and is therefore an important component of metadata.
In this paper I analyse the pre-existing collection of 840 Horizon 2020 public data management plans (DMPs) available on the PHAIDRA repository, to determine which ones mention Creative Commons licences and, among those that do, which licences are being used.
I find that 36% of DMPs mention Creative Commons and that, among those, a number of different approaches towards licensing exist (overall policy per project, licensing decisions per dataset, licensing decisions per partner, licensing decisions per data format, licensing decisions per perceived stakeholder interest), often clad in rather vague language, with CC licences being “recommended” or “suggested”. Some DMPs also “kick the can further down the road” by mentioning that “a” CC licence will be used, but not which one. However, among those DMPs that do mention specific CC licences, a clear favourite emerges: the CC-BY licence, which accounts for half of all mentions of a specific licence.
The fact that 64% of DMPs did not mention Creative Commons at all indicates the need for further training and awareness raising on data management in general, and licensing in particular, in Horizon Europe. Of those DMPs that do mention specific licences, 60% would be compliant with Horizon Europe requirements (CC-BY or CC0). However, it should be carefully monitored whether the other 40% of content, currently under licences that are not Horizon Europe compliant, will move to CC-BY or CC0 in the future, or whether such content will simply be kept fully closed by projects invoking the “as open as possible, as closed as necessary” principle.
About Daniel Spichtinger
Daniel Spichtinger is an independent consultant working on open science, including open access and data management policies.
From 2012 to 2018 he was a member of the unit dealing with open science in the European Commission’s Directorate-General for Research and Innovation.
In this capacity he contributed to the development of EU open access policies (for scientific publications) and open/FAIR research data policies, including the design and implementation of the Open Access to Research Data Pilot in Horizon 2020.
As part of his job, he also developed relations and facilitated inter-institutional and policy dialogue with external stakeholders and other EU institutions. This also involved information and awareness raising on open access, e.g. through public presentations and trainings. He was also responsible for managing several EU funded projects in this area, e.g. RECODE. He is familiar with European level legislation on the subject including the Horizon 2020 regulation, the Recommendation on Access to and Preservation of Scientific Information and the proposed provisions for open access in Horizon Europe.
After finishing his six-year contract with the Commission, Daniel returned to Vienna and registered as a self-employed expert for Open Science and EU Research Policy in 2018. In this capacity, he has been involved in a number of projects, such as:
- An Assessment of EOSC readiness in three European countries (for RFII)
- An analysis of Horizon 2020 Data Management Plans (for OpenAIRE/the University of Vienna)
- A study of Open Access and Open Data in Azerbaijan (for IDI/EuropeAID)
- Recommendations for an Open Access and Research Data Policy in Malta (for EC Policy Support Facility)
He has also provided a number of trainings and publications on open access and FAIR data in the context of open science. He is also employed part-time at the Ludwig Boltzmann Gesellschaft (LBG), where he advises the LBG on third-party funding.
Daniel initially obtained a joint "Magister" (Mag.phil) degree from the University of Vienna (Austria) in English, Communication Science and History (2000), writing his thesis about the global spread of English. He also obtained a Master of Arts in Contemporary European Studies from the University of Bath (UK, 2002), where he first encountered European research policy and tackled the issue of involving civil society in the EU’s Sixth Framework Programme for Research.