Dataverse is repository software developed by the Institute for Quantitative Social Science of Harvard University to enable researchers to archive and publish research data. It is supported by an active Dataverse Development Community with contributors around the world. From fixing bugs and writing documentation to creating integrations and client libraries, the community is a major part of what makes the Dataverse software successful. Fifty-nine institutes worldwide are using the software to establish a research data management solution for their own community.
Currently, the User Interface (UI) is in English. Task 5.2 of the SSHOC project integrated a tool (Weblate) into the installation pipeline of Dataverse to make translating the UI easier. This procedure can be used by organisations that are interested in having a UI in their national language(s). On June 2, we will organise a workshop to explain the ins and outs of the tool, discuss procedures and plan collaboration among translators.
The content of the workshop will be as follows:
Final translation files should be saved in the Dataverse Community GitHub so that everyone can build on them. However, if you are the only one interested in a specific language, you will probably have to maintain it all by yourself. CESSDA is investigating how to support sustainability within CESSDA.
At DANS, we have not made a final decision about the need for a Dutch interface. If we decide we need it, it would of course be a good idea to cooperate. It is good to look for opportunities to work together on translations.
The translation process can be started independently from IQSS/Harvard. How to start a translation in Weblate is described in the Guide created for this workshop. If there is no project yet in Weblate for your Dataverse version, please let us know by sending an e-mail to training@cessda.eu. We can create projects for previous versions, but also for the most recent release (currently version 5.5). Once the translation is final, you can save it in the Dataverse Global Community GitHub. This GitHub exists next to the official Dataverse GitHub and is managed by the Global Dataverse Community Consortium (GDCC), not just Harvard. It is up to each Dataverse installer to decide whether or not to include an additional language.
From past experience, we can say that a professional translator with experience in technical translation needed around 15 work days to translate Dataverse version 4.8 (including metadata blocks and SOLR search fields). A data curator who was very familiar with the front end of Dataverse, but worked on the translation only every now and then, needed several weeks.
You can connect Weblate with Git (and GitHub). From the official Weblate documentation: “Weblate currently supports Git (with extended support for GitHub, Gerrit and Subversion) and Mercurial as version control back-ends.” Weblate also has an API, so it is also possible to develop a connection yourself.
As an alternative, you can download the final translation file from Weblate, and upload it manually to your own system.
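For those who want to script the download instead, the following is a minimal sketch of a call to the Weblate REST API. It assumes the translation file endpoint and token header as we understand them from the Weblate API documentation. The host name, project slug, component slug and token are hypothetical placeholders, not the values of the workshop instance.

```python
from urllib.parse import quote
from urllib.request import Request


def build_download_request(base_url: str, project: str, component: str,
                           language: str, token: str) -> Request:
    """Build (but do not send) a request for one translation file."""
    # Slugs are URL-quoted so unusual characters cannot break the path.
    path = "/".join(quote(part, safe="") for part in (project, component, language))
    url = f"{base_url.rstrip('/')}/api/translations/{path}/file/"
    # Weblate expects the API token in an "Authorization: Token ..." header.
    return Request(url, headers={"Authorization": f"Token {token}"})


if __name__ == "__main__":
    # All values below are placeholders (assumptions for illustration only).
    req = build_download_request(
        "https://weblate.example.org", "dataverse", "bundle", "fr", "YOUR-API-TOKEN"
    )
    print(req.full_url)
    # To actually fetch the file, pass `req` to urllib.request.urlopen()
    # and write the response body to disk.
```

The request is only constructed here, so the sketch can be tried without network access. Please check the endpoint against the API documentation of your own Weblate version before relying on it.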
If you want to offer the Dataverse UI in a language other than the default English, you can configure this in the Dataverse settings. This is described in the Dataverse Guide.
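As a rough illustration of that configuration, the sketch below builds the kind of JSON value that, to our understanding, the Dataverse Installation Guide describes for the `:Languages` database setting: a list of locale codes with the titles to show to users. The setting name and the JSON shape are assumptions that should be verified against the Guide for your Dataverse version. The helper function name is hypothetical.

```python
import json


def languages_setting(languages: list[tuple[str, str]]) -> str:
    """Return a JSON array with one {"locale", "title"} entry per language."""
    entries = [{"locale": locale, "title": title} for locale, title in languages]
    # ensure_ascii=False keeps titles such as "Français" readable.
    return json.dumps(entries, ensure_ascii=False)


if __name__ == "__main__":
    # Assumed example: English as the default plus a French translation.
    payload = languages_setting([("en", "English"), ("fr", "Français")])
    print(payload)
    # The payload would then be sent to the installation's admin settings
    # API for ":Languages" (assumption: see the Installation Guide for the
    # exact command on your version).
```

Generating the value from a list keeps the locale codes and titles in one place when more languages are added later.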
The user will be presented with a dropdown menu in the top navigation bar to choose the desired language. Here is one example of an English and French User Interface.
The metadata itself will stay in the original language. If the metadata was entered in English, switching to, for example, a French User Interface does NOT result in seeing the metadata in French. Only the metadata field labels are translated.
For organisations that provide metadata in both their national language and in English, there is no solution yet. Dataverse does not have a language attribute for the metadata fields. It would be good if this could be changed. The language issue for the content of metadata, e.g. in keywords, will also be addressed in the ongoing work to implement controlled vocabulary support in Dataverse.
Apparently, something went wrong with your user settings. How to add a language is described in the Guide written for this workshop. See paragraph 2.2.
Go to “Languages”. If your language is not there, click “Start new translation” and wait for the application to create a new language (and fetch the source language from GitHub). This will take some time. Another option is to go to “Files” and upload a file with translated strings. This is useful when you already have a partly translated Bundle.properties file. You can choose how the already translated strings are imported into Weblate. For example, you can decide to mark these strings with a “Needs Editing” status automatically.
The translation made in Weblate can be downloaded from the application any time. (Please see the Guide for screenshots.)
Translations that are available can be found in the GitHub repository of the Global Dataverse Community Consortium. Please note that some translations are only available for older versions of Dataverse. Sometimes multiple files are available for the same language. In that case, you can check who made each translation and choose the most trustworthy one.
Yes, this is the plan. It will enhance collaboration. For this workshop we used a French translation in Weblate that we downloaded from this community GitHub. Currently there is no automated upload from Weblate to GitHub, but we may resolve this in the future. For now we can do the upload manually. It is recommended to add the available languages to the GitHub community as soon as they are created, playing by community rules. This also helps others stay informed about work in progress on translations. We also need to check the licence for translations: perhaps the licence states that you will need to contribute back, for example when a GPL licence is used. The Dataverse software is licensed under the Apache License, Version 2.0.
Can you reuse translations of past (other) versions for new versions (example use of Swedish 4.9 to Norwegian 5.X)?
Yes, this is possible. There are two options. The first is to import the Swedish 4.9 language file as if it were a Norwegian translation. You can import it in Weblate with the “Needs Editing” status, for example. In Weblate, you can then select these strings and alter them to create the Norwegian translation.
The second option is to upload the Swedish translation file not under the Norwegian language, but as Swedish. If you indicate in your Weblate profile settings that you are proficient in Swedish, Weblate will then show you the Swedish translation while you are working on your Norwegian translation. If new lines were added to the source language file between version 4.9 and version 5.x, you would need to translate these specific lines without any previous Swedish input.
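To estimate how many lines have no previous input, you could compare the keys of the new source file with those of the older translation. The sketch below is a simplified illustration, not part of Weblate or Dataverse. It only handles plain `key=value` lines and ignores line continuations and other features of the Java properties format. The function names and the example keys are hypothetical.

```python
def parse_properties(text: str) -> dict[str, str]:
    """Parse simple key=value lines; skips blank lines and '#'/'!' comments."""
    entries = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped[0] in "#!":
            continue
        key, sep, value = stripped.partition("=")
        if sep:  # ignore lines that contain no '=' separator
            entries[key.strip()] = value.strip()
    return entries


def keys_needing_translation(source: str, translated: str) -> list[str]:
    """Keys of the source file that are absent or empty in the translation."""
    done = parse_properties(translated)
    return [key for key in parse_properties(source) if not done.get(key)]


if __name__ == "__main__":
    # Hypothetical excerpts: a newer English source and an older Swedish file.
    source_5x = "dataset.title=Title\ndataset.newfield=New field\n"
    swedish_49 = "dataset.title=Titel\n"
    print(keys_needing_translation(source_5x, swedish_49))
```

In practice you would read both Bundle.properties files from disk and pass their contents to `keys_needing_translation`. The resulting list shows which strings will have to be translated from scratch.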
Cyrillic options are available, but are difficult for us to test because of different keyboard settings. Please contact the team if you would like to test this together or if you encounter any problems with it. It is also possible to add frequently used symbols to the menu in the Weblate interface, so you can easily select them while you are translating.
Some important questions you should consider are:
● Which version of Dataverse do we need to translate?
● Who will do the translation? Do we have resources to hire a professional translator with a background in technical translations?
● Does our language have special forms that we need to discuss early on (for example female/male gender in German; Cyrillic alphabet)?
The user guide should be extended with a section with experiences from translators. This could be a community effort to gather this information.
The user guide created for this workshop will be updated with the input from this Q&A session. We will use the CESSDA Dataverse Basecamp for communication. If you are not part of CESSDA and would like to join, please contact the Task 5.2 team by sending an email to training@cessda.eu. We will send you a follow-up email that will also contain information about the communication channels.
Veronika Heider is senior data curator at AUSSDA, the Austrian Social Science Data Archive. She contributes her expertise in making metadata, data and documentation in the AUSSDA Dataverse findable and usable to SSHOC task 5.2.
Laura Huis in ‘t Veld is Information Systems Officer at DANS. She is responsible for user support, configuration and testing of the DataverseNL platform. She is involved in task 5.2 of the SSHOC project and was previously involved in European projects, such as the CESSDA DataverseEU project.
Marion Wittenberg is a service manager at DANS for DataverseNL, a repository service for Dutch universities and research organisations. She is also task leader of task 5.2 of the SSHOC project which adjusts the Dataverse software to the needs of the European SSH community.
See SSHOC Service Catalogue for more information on the SSHOC Dataverse Service.