What translators can learn from the F/OSS community

Looking through my blog archives late last year, I was disappointed to discover I’d posted only seven articles in all of 2013: usually my goal is to get at least one post up every month, and last year was the first time since 2009 that I hadn’t been able to achieve that. So my goal for this year is to blog more frequently and more consistently. And with that, here is my first post of 2014:

In November, I came across a blog post that hit on a number of issues relevant to the translation industry, even though it was addressed to the Free/Open-Source Software (F/OSS) community. It’s called The Ethics of Unpaid Labour and the OSS Community, and it appeared on Ashe Dryden’s blog. Ashe writes and speaks about diversity in corporations, and so her post focused on how unpaid OSS work creates inequalities in the workforce. As she argues, the demographics of OSS contributors differ from those of proprietary software developers as well as the general population, with white males overwhelmingly represented among OSS contributors: one source Ashe cites, for instance, remarks that only 1.5% of F/OSS contributors are female, compared to 28% of contributors to proprietary software. Ashe notes that lack of free time among groups that are typically marginalized in the IT sector (women, certain ethnic groups, people with disabilities, etc.) is the main reason these groups are under-represented in OSS projects.

These demographics are problematic for the workforce because many software companies require their employees (and potential new hires) to have contributed to F/OSS projects. And while some large IT firms do allow employees to contribute to such projects during work hours, people from marginalized groups often do not work at these kinds of companies. This means people who would like to find paid employment as software developers probably need to devote unpaid hours to F/OSS projects so they have a portfolio of publicly available code for employers to consult.

So how is this relevant to translators, or the translation industry? It’s relevant because the same factors affecting the demographics of the F/OSS community are also likely to affect the demographics of the crowdsourced translation community. People can volunteer to translate the Facebook interface only if they have free time and access to a computer; likewise, people with physical disabilities that make interacting with a computer difficult are likely to spend less time participating in crowdsourced projects than people with no disabilities. And since, in many cases, the community of translators participating in a crowdsourced project will largely determine how quickly a project is completed, what texts are translated and what language pairs will be available, the profile of participants is important.

Unfortunately, we don’t have a lot of data about the profiles of people who participate in crowdsourced (or volunteer) projects. The studies that have been done do hint at a larger question worth exploring: O’Brien & Schäler’s 2010 article on the motivations of The Rosetta Foundation’s volunteer translators noted that the group of translators identifying themselves as “professionals” was overwhelmingly female (82%), while the gender of those identifying themselves as amateurs was more balanced (54% female). The source languages of the volunteers were mainly English, French, German and Spanish. My own survey of Wikipedia translators found that 84% of the respondents were male and 75% were younger than 36. Because both these projects show that people with certain profiles participated more than others, it’s clear there’s a need for more research. If we had a better idea of the profiles of those who participate in other crowdsourced translation projects, we would be able to see whether some projects seem more attractive to one gender, which language pairs are most often represented, and what kinds of content are being translated for which language communities. And we could then try to figure out whether (and if so, how) to make these projects more inclusive.

Since it’s still a point of debate whether relying on crowdsourcing to translate the Twitter interface, a Wikipedia article or a TED Talk is beneficial to the translation industry, let’s leave that question aside for a moment and just consider the following: one of the benefits both novice and seasoned translators are supposed to be able to reap from participating in a crowdsourced project is visible recognition for their work. Online, accessible user profiles are common in crowdsourcing projects (as in this example from the TED translator community), and translators are usually given visible credit for their contributions to a project (as in this Webflakes article, which credits the author, translator and reviewer). If certain groups of people are more likely to participate in crowdsourced projects, this kind of visibility is available only to them. If we take a look at where the software development industry is headed (with employers actively seeking those who have participated in F/OSS projects, putting those who cannot freely share their code at a disadvantage), translators could eventually see a similar trend, one that would put those who are unable (or who choose not) to participate in crowdsourced projects at a disadvantage.

I think this is a point worth considering when crowdsourced projects are designed, and it’s certainly a point worth exploring in further studies, as it raises a host of ethical questions that deserve a closer look.

Translation in Wikipedia

I’ve been busy working on a new paper about translation and revision in Wikipedia, which is why I haven’t posted anything here in quite some time. I’ve just about finished now, so I’m taking some time to write a series of posts related to my research, based on material I had to cut from the article. Later, if I have time, I’ll also write a post about the ethics and challenges of researching translation and revision trends in Wikipedia articles.

This post talks about the corpus I’ve used as the basis of my research.

Translations from French and Spanish Wikipedia via Wikipedia:Pages needing translation into English

I wanted to study translation within Wikipedia, so I chose a sample project, Wikipedia:Pages needing translation into English, and compiled a corpus of articles translated in whole or in part from French and Spanish into English. To do this, I consulted recent and previous versions of Wikipedia:Pages needing translation into English, which has a regularly updated list split into two sections: the “translate” section, which lists articles Wikipedians have identified as having content in a language other than English, and the “cleanup” section, which lists articles that have been (presumably) translated into English but require post-editing. Articles usually require cleanup for one of three reasons: the translation was done using machine translation software, the translation was done by a non-native English speaker, or the translation was done by someone who did not have a good grasp of the source language. Occasionally, articles are listed in the cleanup section even though they may not, in fact, be translations: this usually happens when the article appears to have been written by a non-native speaker of English. (The Aviación del Noroeste article is one example.) Although the articles listed on Wikipedia:Pages needing translation into English come from any of Wikipedia’s 285 language versions, I was interested only in the ones described as being originally written in French or Spanish, since these are my two working languages.

I started my research on May 15, 2013, and at that time, the current version of the Wikipedia:Pages needing translation into English page listed six articles that had been translated from French and ten that had been translated from Spanish. I then went back through the Revision History of this page, reading the archived version for the 15th of every month between May 15, 2011 and April 15, 2013 (that is, the May 15, 2011, June 15, 2011, July 15, 2011 and so on through to April 15, 2013 versions of the page), bringing the sample period to a total of two years. In that time, the number of articles of French origin listed in either the “translate” or “cleanup” sections of the archived pages came to a total of 34, while the total number of articles of Spanish origin listed in those two sections was 60. This suggests Spanish-to-English translations were more frequently reported on Wikipedia:Pages needing translation into English than translations from French. Given that the French version of the encyclopedia has more articles, more active users and more edits than the Spanish version, the fact that more Spanish-to-English translation was being reported through Wikipedia:Pages needing translation into English is somewhat surprising.
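For readers curious about the mechanics, here is a minimal sketch of how monthly snapshots like these can be pulled from a page’s revision history. This is an illustration of the approach rather than the script I actually used, and it assumes the standard MediaWiki API endpoint:

```python
# A minimal sketch (assumed workflow, not my actual script) for pulling
# monthly snapshots of a page's revision history via the MediaWiki API.
import requests

API = "https://en.wikipedia.org/w/api.php"
PAGE = "Wikipedia:Pages needing translation into English"

def snapshot(timestamp):
    """Return (revid, timestamp) of the latest revision at or before `timestamp`."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": PAGE,
        "rvlimit": 1,
        "rvstart": timestamp,  # e.g. "2011-05-15T00:00:00Z"
        "rvdir": "older",      # walk backwards in time from rvstart
        "rvprop": "ids|timestamp",
        "format": "json",
    }
    pages = requests.get(API, params=params).json()["query"]["pages"]
    rev = next(iter(pages.values()))["revisions"][0]
    return rev["revid"], rev["timestamp"]

# One snapshot per month, May 15, 2011 through April 15, 2013.
dates = [(y, m) for y in (2011, 2012, 2013) for m in range(1, 13)
         if (2011, 5) <= (y, m) <= (2013, 4)]
for year, month in dates:
    revid, ts = snapshot(f"{year}-{month:02d}-15T00:00:00Z")
    # Each archived version can then be read at:
    # https://en.wikipedia.org/w/index.php?oldid=<revid>
    print(ts, revid)
```

Reading each archived version this way makes it straightforward to tally the French- and Spanish-origin articles listed in the two sections month by month.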

Does this mean only 94 Wikipedia articles were translated from French or Spanish into English between May 2011 and May 2013? Unlikely: the articles listed on this page were identified by Wikipedians as being “rough translations” or in a language other than English. Since the process for identifying these articles is not fully automated, many other translated articles could have been created or expanded during this time: some “rough” translations may simply not have been spotted, while translations free of the grammatical errors and awkward syntax associated with non-native English may have passed unnoticed by the Wikipedians who would otherwise have added them to this page. So while this sample group of articles is probably not representative of all translations within Wikipedia (or even of French- and Spanish-to-English translation in particular), Wikipedia:Pages needing translation into English was still a good source from which to draw a sample of translated articles that may have undergone some sort of revision or editing. Even if the results are not generalizable, they at least indicate the kinds of changes made to translated articles within the Wikipedia environment, and therefore, whether this particular crowdsourcing model is an effective way to translate.

So what are these articles about? Let’s take a closer look via some tables. In this first one, I’ve grouped the 94 translated articles by subject. Due to rounding, the percentages do not add up to exactly 100.

Subject                                        French articles   % of French total   Spanish articles   % of Spanish total
Biography                                            20                58.8%               20                 33.3%
Arts (TV, film, music, fashion, museums)              3                 8.8%                8                 13.3%
Geography                                             2                 5.8%               12                 20%
Transportation                                        2                 5.8%                4                  6.7%
Business/Finance (includes company profiles)          2                 5.8%                2                  3.3%
Politics                                              1                 2.9%                4                  6.7%
Technology (IT)                                       1                 2.9%                1                  1.6%
Sports                                                1                 2.9%                1                  1.6%
Education                                             1                 2.9%                1                  1.6%
Science                                               1                 2.9%                0                  0%
Architecture                                          0                 0%                  2                  3.3%
Unknown                                               0                 0%                  3                  5%
Other                                                 0                 0%                  2                  3.3%
Total                                                34                99.5%               60                 99.7%

Table 1: Subjects of translated articles listed on Wikipedia:Pages needing translation into English (May 15, 2011-May 15, 2013)

As we can see from this table, the majority of the translations from French and Spanish listed on Wikipedia:Pages needing translation into English are biographies—of politicians, musicians, actors, engineers, doctors, architects, and even serial killers. While some of these biographies are of historical figures, most are of living people. The arts—articles about TV shows, bands, museums, and fashion—were also a popular topic for translation. In the translations from Spanish, articles about cities or towns in Colombia, Ecuador, Spain, Venezuela and Mexico (grouped here under the label “geography”) were also frequent. So it seems that the interests of those who have been translating articles from French and Spanish as part of this initiative have focused on arts, culture and politics rather than specialized topics such as science, technology, law and medicine. That may also explain why the articles are so visibly associated with French- and Spanish-speaking regions, as the next two tables demonstrate.

I created these two tables by consulting each of the 94 articles and identifying the main country associated with the topic, except in cases where the article had been deleted and no equivalent article could be found in the Spanish or French Wikipedias (marked “unknown” in the tables). A biography of a French citizen, for instance, was counted as “France”, as were articles about French subway systems, cities and institutions. Every article was associated with just one country. Thus, when a biography was about someone who was born in one country but lived and worked primarily in another, I labelled the article as being about the country where that person had spent the most time. Manuel Valls (http://en.wikipedia.org/wiki/Manuel_Valls), for instance, was born in Barcelona, but he became a French citizen over thirty years ago and is a politician in France’s Parti socialiste, so his article was labelled “France.”

Country/Region    Number of articles
France                    24
Belgium                    2
Cameroon                   1
Algeria                    2
Canada                     1
Western Europe             1
Switzerland                1
Romania                    1
n/a                        1
Total                     34

Table 2: Primary country associated with translations from French Wikipedia


Country          Number of articles
Spain                    13
Mexico                   10
Colombia                 10
Argentina                 7
Chile                     3
Venezuela                 3
Peru                      2
Ecuador                   2
Nicaragua                 1
Guatemala                 1
Uruguay                   1
United States             1
Cuba                      1
n/a                       1
Unknown                   4
Total                    60

Table 3: Primary country associated with translations from Spanish Wikipedia

Interestingly, these two tables demonstrate a marked contrast in the geographic spread of the articles: roughly 70% of the French source articles dealt with a single country (France), while the Spanish source articles were spread more evenly, with the three most common countries (Spain, Colombia and Mexico) together accounting for just over half of the total, in nearly equal proportions. The two tables do, however, demonstrate that the vast majority of articles had strong ties to French- or Spanish-speaking countries: only two exceptions (marked as “n/a” in the tables) did not have a specific link to a country where French or Spanish is an official language.

I think it’s important to keep in mind, though, that even though the French/Spanish translations in Wikipedia:Pages needing translation into English seem to focus on biographies, arts and politics from France, Colombia, Spain and Mexico, translation in Wikipedia as a whole might have other focuses. Topics might differ for other language pairs, and they might also differ in other translation initiatives within Wikipedia and its sister projects (Wikinews, Wiktionary, Wikibooks, etc.). For instance, the WikiProject:Medicine Translation Task Force aims to translate medical articles from English Wikipedia into as many other languages as possible, while the Category:Articles needing translation from French Wikipedia page lists over 9,000 articles that could be expanded with content from French Wikipedia, on topics ranging from French military units, government, history and politics to geographic locations and biographies.

I’ll have more details about these translations in the coming weeks. If you have specific questions you’d like to read about, please let me know and I’ll try to find the answers.

Experimenting with Wikipedia in the classroom

Late last year, I came across a very insightful podcast series called BiblioTech on the University Affairs website. Each episode focuses on technology and higher education–Twitter in the classroom, for instance, or storage in the cloud–so of course I was immediately hooked. I had missed the first thirteen episodes, but they’re all quite short–usually between ten and fifteen minutes long–so I managed to catch up after two jogs and a commute to work.

Episodes 12 (Wikipedia) and 13 (Plagiarism) in particular piqued my interest and actually inspired me to change the format of the courses I’m teaching this term: an MA-level Theory of Translation and a BA-level Introduction to Translation into English course.

First, I listened to the Plagiarism episode, which mainly discussed how to design tests and assignments that discourage students from cheating. As host Rochelle Mazar, an emerging technologies librarian at the University of Toronto’s Mississauga campus, argued:

We need to create assignments that have students produce something meaningful to them, but opaque to everyone else.

Her suggestions included having students use material from the classroom lectures and discussions in their assignments (e.g. by blogging about each week’s lectures, and then using these blog posts to write their final paper), having students build on peer interactions via Twitter, Facebook or the course website to develop their assignments, or having students contribute to open-access textbooks through initiatives like Wikibooks.

I then listened to the Wikipedia episode, where Mazar made the following argument about why instructors should integrate Wikipedia into classroom assignments:

When people tell me that they saw something inaccurate on Wikipedia, and scoff at how poor a source it is, I have to ask them: why didn’t you fix it? Isn’t that our role in society, those of us with access to good information and the time to consider it, isn’t it our role to help improve the level of knowledge and understanding of our communities? Making sure Wikipedia is accurate when we have the chance to is one small, easy way to contribute. If you see an error, you can fix it. That’s how Wikipedia works.

Together, these two episodes got me thinking about the assignments I would be designing for my courses, and it didn’t take me long to decide that I would incorporate Wikipedia and blogging into my courses: translation of Wikipedia articles for the undergraduate translation course, and blogging as the medium for submitting, producing and collaborating on written work in the graduate theory course. Next month, I’ll write a post about how I decided to integrate blogs into my graduate theory class, but right now, I want to focus on Wikipedia and its potential as a teaching tool in translation classrooms.

But first, a short digression: A couple of years ago, I had students in my undergraduate translation classes work in groups or pairs to translate texts for non-profit organizations as a final course assignment. The students seemed to really like translating texts that would actually be used by an organization instead of texts that were nothing more than an exercise to be filed away at the end of term. And I enjoyed being able to submit a large project to a non-profit at the end of the term. But it was a lot of work on my part, mainly because I acted as a project manager: finding a non-profit with a text of just the right length and just the right difficulty, splitting up the text for the class, correcting the final submissions, and finally translating the rest of the text, since the documents we were given were inevitably too long for me to assign entirely to the students. So after two years, I went back to having students translate less taxing texts, like newspaper or magazine articles, since it’s easier to correct twenty translations of the same text than it is to correct twenty excerpts from a longer project. But I did miss the authentic assignments.

So, when I listened to the BiblioTech podcasts, I realized Wikipedia might be a good solution to the problem. Students can choose their own articles to translate (freeing me from the project-management role), and the wide variety of subjects needing translation–Wikipedians have tagged over 9,000 articles as possible candidates for French-to-English translation–means we should be able to find something to interest everyone, and something just the right length for the assignment (around 300 words per student). I still expect to have to spend more time correcting the translations, but I think this will be less work overall than the previous projects.

As I was planning out the project, I was pleasantly surprised to discover that the Wikimedia Foundation has established an education program in Canada, the United States, Brazil and Egypt. The Canada Education Program is intended to help university professors integrate Wikipedia projects into their courses, and it offers advantages like an online ambassador for every class to help students navigate the technical challenges of editing in the Wikipedia environment. In addition, there’s an adviser who works closely with professors who join the program. Fortunately for me, he’s based in Toronto, which means I was able to chat with him earlier this month about the program. His recent article in the Huffington Post offers some good arguments for why Wikipedia is a useful classroom tool. He suggests, for instance, that since organizations like the CIA use wikis in their work environments, students are likely to need to be familiar with wiki technology and culture after they graduate. In addition, students gain exposure by contributing to articles that are visible online, and they learn to engage in debates with classmates and Wikipedians as their contributions are reviewed and edited by others.

I’m still in the early stages of this experiment… I don’t yet know, for instance, whether students will have a lot of trouble editing their articles, or whether the technical challenges can all be solved by the online ambassador who will be working with us. I’ve asked students to use Google Documents to do most of the translating work, but I’m expecting them to add the final versions to Wikipedia before the end of the term, so many of these problems may crop up only in March or April. I also expect a lot of in-class discussion about Wikipedia’s Translation Guidelines, which encourage omission of irrelevant information and adaptation or explanation of cultural references:

Translation between Wikipedias need not transfer all content from any given article. If certain portions of an article appear to be low-quality or unverifiable, use your judgment and do not translate this content. Once you have finished translating, you may ask a proofreader to check the translation.
[…]
A useful translation may require more than just a faithful rendering of the original. Thus it may be necessary to explain the meaning of terms not commonly known throughout the English-speaking world. For example, a typical reader of English needs no explanation of The Wizard of Oz, but has no idea who Zwarte Piet might be. By contrast, for a typical reader of Dutch, it might be the other way around.

Because students may find they have more freedom to make their own judgements about the relevance of information, I’ve asked them to give an in-class presentation at the end of the term about their translation decisions and the experience of working in Wikipedia. I’ll be sure to post some of my own thoughts on this experiment after the term is over, the marking is complete and the translations are posted online. I’ll even post links to some of our work.

Has anyone else used (or thought about using) Wikipedia articles as translation assignments? If so, I’d certainly appreciate your comments.

Survey on crowdsourced translation initiatives launched

This weekend, I finally began sending out the invitations for the survey I’ve been preparing on crowdsourced translation initiatives. It asks respondents about their backgrounds, whether they have any formal training in translation, why they have decided to participate (or not to participate) in crowdsourced translation projects, and whether their participation has affected their lives (e.g. whether they have received job offers or met new colleagues as a result).

I’ve begun with Wikipedia, but I plan to invite respondents who have participated in other crowdsourced translation initiatives, including TEDTalks, Kiva and Global Voices Online. I’ve just finished randomly sampling the Wikipedians who have helped translate content into English, and I will now start randomly sampling those who have translated from English into French, Spanish and/or Portuguese. I’m hoping to determine whether participant profiles differ from one project to another: for instance, does the average age of participants vary from one project to another? Do some projects seem to attract more people with formal training in translation? Do motivations differ when participants are translating for non-profit initiatives rather than for-profit companies?
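For those curious about what this sampling step looks like in practice, here is a toy sketch; the file name, column name and sample size are invented for the example, and my actual procedure differed in its details:

```python
# A toy sketch of drawing a random sample of survey invitees from a list of
# contributors. The file name, column name and sample size are invented.
import csv
import random

random.seed(2011)  # fixed seed so the same sample can be drawn again

with open("wikipedia_translators.csv", newline="", encoding="utf-8") as f:
    usernames = [row["username"] for row in csv.DictReader(f)]

invitees = random.sample(usernames, k=min(100, len(usernames)))
for name in invitees:
    # Invitations would then go out via each user's talk page.
    print(f"https://en.wikipedia.org/wiki/User_talk:{name}")
```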

Responses have started to trickle in, and I’m already starting to see some trends, but I won’t say anything more until all of the surveys have been submitted and I’ve had a chance to analyze the results. If you’re interested in finding out more details about the survey, please let me know. And if you want to see some of the results, check back in a few months. I expect to have some details to discuss by late March or early April.

Authenticity in the classroom

Last year, I posted a little about my experiment with community-service learning: having students translate texts for a non-profit organization so that they could use their learning to help an organization that either does not have a budget for translation or that could use its translation budget to more directly support its cause. When we had finished our translation last year, I asked the students what they thought about working on more authentic texts like the HR manual we had just completed for Action contre la faim. Those who answered said that they were really happy their translations would actually be used for something and that they were helping a non-profit organization.

So this year, my Introduction to Translation into English students will be working in pairs to translate texts for Eau Vive, a non-profit organization based in France that works in Africa to help improve drinking water conditions. Next term, they’ll complete a group project similar to the one I assigned last year. Here’s what I learned from last year’s experiment and how I plan to improve the projects this year:

1. Google Docs
I’ve already posted two entries (here and here) about having students work with Google Documents to collaborate on their translations for Action contre la faim. Despite the disadvantages I noted in my second entry, I decided to have students work with Google Documents again this year. Two changes should help mitigate some of those disadvantages. First, the documents students will be translating are all separate pages for the Eau Vive website. One of the problems I had last year was that we were working on a large HR manual, which I had to divide up among the students and then paste back together into a single file as I received the translations from each group. With 10 smaller, self-contained (yet closely related) texts (instead of a 40-page manual), I should be able to export the translations from Google Docs into a Word document and then make revisions more easily. Second, I’ve pointed out to students that Google Docs has an integrated chat feature, as many students last year complained that they had difficulty talking to their group members and working on the translation at the same time. Many of them hadn’t noticed the chat feature because I hadn’t shown it to them when I talked about using Google Documents. We’ll see whether students find collaboration a little easier this time around.

2. Progressive projects
Last year, I had students work in groups of four or five to translate anywhere from 1,000 to 1,250 words from the HR manual. I think this was more challenging than necessary because students had never really translated in groups before, and few of them had ever translated more than 250 words at a time. Many were reluctant to act as just revisers or just terminologists because they felt they should do some translating work. And I suspect that many found collaboration hard because they just weren’t used to working with other students on their translations. My hope is that by working in pairs during the fall term, students will be better prepared to work in groups in the winter. And by translating 400-500 words now, students will gradually get used to working on longer texts, so their group project won’t seem as intimidating.

I will be asking students about their experiences working with Google Documents this term, and I’ll write a post about whether there was a difference between last year’s group project and this year’s partner project. That should help show whether Google Docs is more effective for small groups or pairs of students, and whether students find it a more helpful tool for collaboration now that they know they can chat in real-time while working on their translations. Has anyone else used group or partner translation projects in their classes? If so, what were your experiences?

Getting ready for a new academic year

Earlier this week, as I was preparing my syllabus for my introductory translation into English class, I thought I should blog a little about the changes I’m making to the course this year, in case this might interest other instructors or professors. So here is an overview of what worked well in this course last year, and what I’m hoping will work better this year.

What worked well last year and is going to be in the course again: Participation questions
Last year, I added a new component to my classes: every week or so, I would ask students an open-ended question based on either the current week’s lecture or a podcast/news article/website with a translation-related theme. I then gave them five or six minutes to write down their answers on a piece of paper, and when they were done, I invited them to share their thoughts if they so desired and to hand in their answers at the end of class. The discussion questions also served as a way of taking attendance, since students got full marks for answering the question and no marks if they were not in class on the day we discussed it. I liked these questions for three reasons: first, they gave me a way to digress from the main course format (short lecture, then take up homework) while still focusing on translation; second, they allowed me to take attendance without actually doing so, which helped ensure students attended class regularly; and third, having students write down their answers gave them time to think about the question, which made more of them volunteer their answers during the discussion.

What I hope will work better: New take-home assignments
In past years, I’ve assigned two take-home translations of about 250 words each, which students work on individually and submit within two weeks, for about 25% of their final grade. Last winter, though, I added a new assignment and had students work in small groups to translate a large text for a not-for-profit organization. Feedback from the students after the assignment was really positive: they enjoyed translating an authentic text and liked the fact that a non-profit organization was benefiting. So this year, I’ll be having students pair up in the fall term to translate about 500 words for another non-profit, and in the winter, they’ll work in groups of 3-5 on 1,000-2,000-word texts so that they get more experience collaborating with others and working on longer documents. I’ll also be able to see whether students prefer working with Google Documents when they collaborate with one other student or when they work in small groups. I also want to see whether they get more out of Google Docs if they have to use it for two projects in two semesters, rather than just for a large group project in March and April, as students did last year.

What’s not going to make it into the course this year: Google Wave
Last year, I mentioned my plans for comparing Google Docs and Google Wave as collaborative tools for student group translations. Early last month, however, Google announced that it would no longer be developing Wave as a stand-alone product and would instead “extend the technology for use in other Google projects.” Although Google promises to keep the website live until at least the end of the year–and thus probably for my entire Fall course–I don’t want to spend time familiarizing students with an application that is being discontinued. It’s a shame, really, as I saw some good potential in Wave as a tool for students to collaborate online on their translations, and I hoped it would make up for some of the shortcomings students mentioned last year when I asked them how they liked working with Google Docs. I’ll have to see whether there are other tools I could have students use for the group assignment in the Winter term, but for now, it will likely be Google Docs again, with some more specific instructions to the students to help mitigate the inconveniences Google Docs caused me when I was marking the assignments.

The ethics of crowdsourcing

I’ve almost finished my paper on translation blogs, and I’m getting ready to move on to my crowdsourcing projects. That’s why I was glad to hear that the editors of Linguistica Antverpiensia accepted my proposal for their special issue on community translation. Here’s what I plan to write about:

If, as Howe (2008: 8) argues, “labour can often be organized more efficiently in the context of community than it can in the context of a corporation[,] the best person to do a job is the one who most wants to do that job[,] and the best people to evaluate their performance are their friends and peers who […] will enthusiastically pitch in to improve the final product, simply for the sheer pleasure of helping one another and creating something beautiful from which they will benefit,” crowdsourcing raises some ethical questions. What, for instance, are some of the implications of for-profit companies benefiting financially from user communities who help create something from which not only the users will benefit but also the companies themselves? What effects might a user’s interest in a project or commitment to a cause have on his or her translation? If crowdsourcing makes available translations that would otherwise not be produced or which would be available only after a long delay (e.g. translations into “minor” target languages, translations of less relevant texts, such as discussion forums), is this reward enough for the community, or do members deserve other forms of remuneration as well? What effects might these forms of remuneration have on community members, professional translators, non-profit and for-profit organizations, and users outside the community? Using examples of crowdsourced translation initiatives at non-profit and for-profit organizations, including Kiva, Global Voices Online, Asia Online, Plaxo and TEDTalks, this paper will explore various ethical questions that apply to translation performed by people who are not necessarily trained as translators or remunerated for their work. To better explore questions related to translation into major and minor languages, this paper will contrast the target languages offered through these crowdsourced initiatives with those offered via the professionally localized websites of five top global brands. It will also search for answers to these ethical questions by comparing the principles shared by the codes of ethics of professional translation associations in fifteen countries.

As I’ll be working on this paper between now and April 2011, I’d be very interested to hear from anyone who has worked on a community translation project, whether as a translator, editor, developer, organizer, etc. What are your thoughts on the ethics of crowdsourcing? Leave me a comment or contact me over the next few months and let me know your point of view.

January 2012 Update: My article on the ethics of crowdsourcing is now available. It was published in Linguistica Antverpiensia 10 (2011), the theme of which was “Translation as a Social Activity–Community Translation 2.0.” The table of contents is available here.

Teaching with Google: My perspective

In my last post in this series, I discussed what my students thought of using Google Documents to collaborate on their group translation projects. Now that I’ve (finally) finished marking the assignments, I thought I’d write a little about my own experience of using the application to revise and mark the projects.

First, the advantages:

  1. Having the students share their documents with me meant my inbox wasn’t cluttered with dozens of emails from students sending me various files: source text, target text, descriptions of their individual contributions to the project (all mandatory) and spreadsheets of their terminology or documents containing their group discussions (both optional but frequently submitted). The documents remained on Google’s server, so I didn’t have to download them and then attach them to emails to send the corrected files back to the students.

That’s all the good points I can think of right now. Unfortunately, what’s really on my mind at the moment are the disadvantages:

  1. Google Documents does not have a record/track changes feature. This was definitely the biggest drawback to using the application. Instead of being able to indicate the corrections just by turning on the track changes feature (which, in OpenOffice, automatically marks corrections in a different colour), I had to highlight each passage that needed revisions, then change the font colour manually and type in my correction. This took a lot of time, because it meant I had to revise each document twice: once in Google to show the students their mistakes and my corrections, and then once in the final document, which I was preparing in OpenOffice. And this leads me to the second disadvantage.
  2. Google doesn’t handle various advanced word-processing features very well. These include automatically generated tables of contents, as well as tables, bullets and lists, all of which were found in the 40-page text we were working on. Because of this, I wanted to produce the translation in OpenOffice, working directly on a copy of the original French source text file. However, unless I copied and pasted one paragraph at a time, the text I copied was pasted into my OpenOffice document within a frame, which would have ruined the formatting of the final document. So, for every paragraph, I had to copy the student translation, paste it into the OpenOffice document, make any necessary corrections, and then go back to the Google Document and mark up the text so that students could see what changes I was recommending. Very time-consuming.
  3. The default font for a Google Document seems to be a sans-serif typeface, which was not the font used in the French text. Most of the groups didn’t change the default font, so I had to do it myself when I copied and pasted the translation back together.
  4. Students didn’t always work directly in Google Documents. Some of them (as they explained in their reports) typed up their translations in Word, then copied and pasted the final versions into Google. And, when they revised their translations, they often did so as a group, with one person entering the revisions after everyone had agreed on what needed to be changed. This meant I usually couldn’t use the revision history feature of Google Documents to see which students had proposed which changes. I had expected the students to work primarily from home and to collaborate online through Google Documents, but they often got together in person, with one student logged onto a computer to either translate or revise the text based on group input. This wasn’t necessarily a Google-specific drawback, however; it just shows that students used the application in a way I didn’t expect.

Overall, I would have to say that I was surprised by the number of disadvantages of using Google Documents to mark and revise the student projects. While the students had mainly positive things to say about the application (with the exception of those who had a number of bullets or tables, which did not always convert properly to and from a Google Document), I found the experience inconvenient.

While I will have students use it again next year, I think I will have them email me a final version of their translation in .doc format so that I can then use the track changes feature to mark the translation and quickly add it to the final document. They can still share the rest of the files with me through Google, as this will reduce the number of files I need to upload and download, and help me spend less time editing.

Words in Transit

I spent some time thinking recently about internship opportunities for translation students. In a previous post, I discussed an article by Sébastien Stavrinidis outlining some of the challenges of arranging internships for students. I proposed a new type of internship where students would volunteer to translate texts for humanitarian organizations and professional translators would volunteer to revise these translations, allowing students to gain work experience without having to relocate… something that is currently difficult for anglophone students studying translation in Toronto. It would also be a way to apply crowdsourcing to internships: in a traditional internship, a student’s work is revised by one or two translators, but in this kind of internship, students would receive feedback from various revisers, and the program would grow as more translators, students and organizations agreed to participate.

While talking with some of my students a few weeks ago, after they’d submitted their group projects (a translation for Action Contre la Faim), I decided that I would really like to pursue my internship idea. The students who spoke to me described the translation project very positively: they were excited that their translations would actually be used (instead of just being filed away somewhere) and they also felt happy to have helped a humanitarian organization.

So last week, when I probably should have been spending more time marking essays and assignments, I launched the wordsintransit.org website, which will be the main forum for bringing together students, professionals and non-profit organizations. I’ve already contacted a few students about participating in the project, and I’m in the process of contacting other translators and Quebec-based NPOs. In about a month, two colleagues and I will start the non-profit organization that will operate the Words in Transit initiative, and we’ll run it for a year on an experimental basis. I’ll be blogging about the initiative both here and on the Words in Transit website, so check back soon for more details.

If you’re interested in participating in the initiative, please let me know. You’ll find more details about how you can get involved here.

Crowdsourcing: One of the top two threats to professional translators?

According to a recent article in Translorial, the journal of the Northern California Translators Association, the American Translators Association Board has declared crowdsourcing one of the top two threats to the profession and the association, tied with the economic downturn.

A companion piece that was also part of the February 2010 issue of Translorial offers a brief summary—and a link to the video recording—of a talk from the 2009 general meeting of the Northern California Translators Association. The talk was entitled “New Trends in Crowdsourcing: The Kiva/Idem Case Study,” and it was given by Monica Moreno, localization manager at Idem Translations, and Naomi Baer, Director of Micro-loan Review and Translation at a not-for-profit microfinancing organization called Kiva. (Baer, incidentally, is also the author of the first Translorial article I cited).

Despite the ATA’s rather dour opinion of crowdsourcing, both the Translorial article and the presentation by Moreno and Baer offer a fairly positive view of the opportunities crowdsourcing provides not just to the companies that turn to volunteers for their translation needs, but also to web users, minority-language communities, and even professional translators. After all, as Moreno and Baer noted, languages that are considered Tier 2 or lower by corporations are often used in crowdsourcing initiatives. Just look at the TED Open Translation Project, one of the crowdsourcing initiatives cited in the presentation.

As of March 26, 2010, TEDTalks have been subtitled into more than 70 languages, including Swahili, Tagalog, Tamil, Icelandic and Hungarian. More than 400 talks have been subtitled in Bulgarian, nearly 300 in Arabic, and more than 200 in Romanian, Polish and Turkish. And these figures compare favourably with traditional Tier 1 languages: French (304 talks), Italian (263 talks), German (195 talks) and Spanish (575 talks). By comparison, large localization projects by commercial organizations don’t usually offer as many languages. Of Google, Microsoft and Coca-Cola, which topped the 2009 Global Brands ranking published in the Financial Times, Coca-Cola appears to have been localized for the most country and language pairs, with a whopping 124 countries and 141 locales, while Microsoft is a close second at 124 locales. However, many of the links to Coca-Cola sites (e.g. nearly all of the 44 African locales) actually take users to the US English site, so Coke probably offers just under 100 locales, many of which (e.g. 13 of the 30 Eurasia locales) are actually English-language versions. Likewise, IBM, the fourth-ranked brand, offers 100 locales, but 49 of them are English-language versions, and another 10 are in Spanish. So, while some of the largest brands initially appear to have targeted more linguistic groups, the TEDTalks have actually been made available in more languages.
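The underlying arithmetic is worth making explicit: a locale count overstates language coverage whenever several locales share one language. Here is a toy illustration (the locale codes are invented for the example, not taken from any of the sites above):

```python
# A toy illustration of why locale counts overstate language coverage:
# several locales often share a single language. These codes are invented.
locales = ["en-US", "en-GB", "en-ZA", "en-IN", "fr-FR", "fr-CA",
           "es-ES", "es-MX", "es-AR", "de-DE", "pt-BR"]

# Collapse each locale code to its language part ("en-US" -> "en").
languages = {code.split("-")[0] for code in locales}
print(f"{len(locales)} locales, but only {len(languages)} languages")
# -> 11 locales, but only 5 languages
```

Counted this way, IBM’s 100 locales, 49 of them in English and 10 in Spanish, collapse to at most 43 distinct languages, which is why TED’s 70-plus subtitle languages come out ahead.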

In addition, smaller linguistic communities within a region are not often targeted by the larger corporations, as these groups may not have the purchasing power to justify translation costs. Microsoft, Coca-Cola, IBM, Procter & Gamble, and Ikea, for instance, all offer their Spain websites only in Spanish, while some TED videos (as well as Google) are available in Catalan and Galician. With non-profit initiatives, where users may feel driven to contribute their time to support a particular cause or to make information (like the TED talks) accessible to those who don’t speak the source language, crowdsourcing can help reduce the language hierarchy that for-profit localization initiatives encourage. The translations are user-generated and sometimes user-initiated, so as long as enough members of a community feel committed to making information available, they will provide translations into so-called major and minor languages alike, without worrying about a return on investment. What we need now, then, is more research into the quality of the translations produced by volunteer, crowdsourced efforts. Making information available in more languages is laudable, but if the translations are inaccurate, contain omissions or add information, then the crowdsourcing model may not be as advantageous as it appears.

The presentation by Moreno and Baer also offered a few insights into the motivations of volunteer translators: some wanted to give back to the community, others wanted to mentor student or amateur translators without having to make a significant time commitment, while still others saw it as a networking opportunity. As Baer noted, her volunteer efforts for Kiva eventually landed her a paid job with the organization. These anecdotal details about translator motivations underscored (at least for me) the need to systematically research the motivations of the people involved in crowdsourced translation projects. I think it’s worth comparing the motivations of those involved in not-for-profit initiatives like TED, Kiva, or Global Voices (which I’ve discussed in a previous post) with those involved in initiatives launched by for-profit companies such as Facebook. I suspect that motivations would differ, but a survey of the volunteers could confirm or refute this hypothesis.

Overall, the presentation by Moreno and Baer is definitely worth watching if you’re at all interested in crowdsourcing and translation. It’s available on Vimeo at this address: http://vimeo.com/8549171.