Translation internships

This past June, I attended a conference in Edmonton that focused on Canadian labour and social movements. Although I enjoyed the conference quite a bit and was particularly intrigued by how academics, labour unions, social activists (and even a few “trouble makers”) gathered together at the same event to share research, stories, and calls to action, I didn’t blog about it earlier because it wasn’t a translation-focused event. (My talk, which was about how and why Left-leaning Canadians translated Quebec sovereignty texts into English in the 1970s, 80s and 90s, was the only one that looked at translation.) Yet the conference did lead me to carefully consider (and sometimes reconsider) various social issues that are relevant to translation. If you’ve read my blog posts before, you’ll know, of course, that crowdsourcing, and the ethics of crowdsourcing in particular, is one of my research interests, so it shouldn’t come as a surprise that the conference had me once again pondering the effects of unpaid or underpaid labour on the translation profession. But later, as the new academic year started and I took over our various internship programs, I started to also think about ethical questions related to internships for translation students.

The problem that I wrestle with as I try to find internships is that generally, I can’t offer an internship to every student who applies for one. This is usually because although many Toronto-area translation companies have been generously taking on interns for a number of years, many smaller companies and individual translators (who make up a large portion of the translation market) find working with an intern difficult to coordinate: how, for instance, can they be sure they will have work for the intern every week? How will supervising an intern fit into their schedules, which might be full due to travel and other commitments? Since there are only so many large translation service providers who might be able to accommodate an intern or two, this leaves me with more students than internships nearly every term.

And this problem is actually more complex, because although about half of our internships are paid, the other half are not, leaving students who cannot afford to do unpaid work unable to take advantage of an opportunity to get more translation experience, and leaving me with the problem of offering paid work to some students and unpaid work to others. These kinds of challenges (unpaid internships, and internships that are not available to everyone) are not unique to translation, as this article in Canadian Lawyer demonstrates, but they are something I’ll be spending more time thinking about. I may be overestimating the interest in internships: some students are pursuing a degree in translation without intending to become translators and therefore may not be particularly interested in professional experience in the field. Others may not be concerned about unpaid work because they value the experience an internship provides. To find out more, I plan to survey our students later this year, to see what they think about internships, both paid and unpaid. My hope is that the results will help me feel more confident that the internship programs I coordinate are benefitting as many students as possible, as fairly as possible. I’ll blog more about it when I have some of the results.

In the meantime, I’d be very interested in hearing from others who run an internship program: what has your experience been with the internships? What do students think about unpaid work? Are internships mandatory, and therefore equally accessible to everyone? Do you have any ethical concerns about translation internships? If so, do you have any ideas about how to address these concerns?

Summer breaks and productivity

After presenting papers at three conferences in St. Catharines, Edmonton, and Barcelona in May, June and July, I spent the past two months on activities largely unrelated to my research, teaching or translating. I read six novels (more than I’ve managed to read in the last three years combined). I repainted parts of the house. I weeded and watered the vegetable garden in our backyard. And I spent a lot of time at playgrounds, splash pads and parks with my children. I did finish correcting two other articles that will soon appear in print, but aside from a book review, I didn’t submit any new texts to journals. In short, I’ve enjoyed the past two months, which were the first extended break I’ve taken from academic activities in several years.

And yet, I’ve still wrestled with the thought that I should be more productive. From time to time, I’ve wondered whether I will regret not getting further along in the survey I’m designing to find out more about what our undergraduate students think about internships. On more than one occasion, I got annoyed that I hadn’t returned to the major research project I’ve been tackling on and off for the past three years. I should be transcribing interviews! Visiting archives! Blogging! And, as the start of the Fall term has drawn closer, I’ve regretted not having the syllabi for all three courses completely finalized.

But working just one day a week for the past eight weeks has taught me that I can spend part of my summer break finishing off just the most pressing tasks without dire consequences. And in those moments when twinges of guilt did make themselves felt, I asked myself whether I would rather my children think of me as the mother who published a thirteenth article this summer, or as the mother who helped them make popsicles with the raspberries from our garden, who guided them through making a jar of pesto with our basil leaves, who showed them how to toss a salad made from the tomatoes that just a few months ago were tiny seeds in our kitchen. I’ll work on that thirteenth article in the fall, and it probably won’t matter at all that I didn’t start it sooner. In fact, I may even try this again next year.

Should I fix this mistake or not? On the ethics of researching Wikipedia

I’ve just finished some of the final edits for an article that has just been published in Translation Studies, and it reminded me that I’ve been meaning to write a blog post about an ethical dilemma I faced when I was preparing my research. So before I turn to a new project and forget all about this one again, here’s what happened.

The paper focuses on a corpus of 94 Wikipedia articles that have been translated in whole or in part from French or Spanish Wikipedia. I wanted to see not just how often translation errors in the articles were caught and fixed, but also how long it took for errors to be addressed. It will probably not come as any surprise that almost all of the articles I studied initially contained both transfer problems (e.g. incorrect words or terms, omissions) and language problems (e.g. spelling errors, syntax errors), since they were posted on Wikipedia:Pages needing translation into English, which lists articles that are included in English Wikipedia but which contain content in another language, content that requires some post-translation editing, or both. Over the course of the two years leading up to May 2013, when I did the research, some of the errors I found in the initial translations were addressed in subsequent versions of the articles. In other cases, though, the errors were still there, even though the page had been listed as needing “clean-up” for weeks, months, or even years.

And that’s where my ethical dilemma arose: should I fix these problems? It would be very simple to do, since I was already comparing the source and target texts for my project, but it felt very much like I would be tampering with my data. For instance, in the back of my mind was the thought that I might want to conduct a follow-up study in a year or two, to see whether some of the errors had been resolved with more time. If I were to fix these problems, I wouldn’t be able to check on the status of these articles later, which would prevent me from finding out more about how quickly Wikipedians resolve translation errors.

And yet, I was torn, partly due to a Bibliotech podcast I’d listened to a few years ago that made a compelling argument for improving Wikipedia’s content:

When people tell me that they saw something inaccurate on Wikipedia, and scoff at how poor a source it is, I have to ask them: why didn’t you fix it? Isn’t that our role in society, those of us with access to good information and the time to consider it, isn’t it our role to help improve the level of knowledge and understanding of our communities? Making sure Wikipedia is accurate when we have the chance to is one small, easy way to contribute. If you see an error, you can fix it. That’s how Wikipedia works.

In the end, I didn’t make any changes, but this was mainly because I didn’t have the time. I didn’t want to tamper with my data while I was writing the paper, and after I had submitted it, I didn’t get around to going back through the list of errors I’d compiled to start editing articles. Most of the corrections would have been for very minor problems, such as changing a general word (“he worked for”) to a word that more specifically reflected the source text (“he volunteered for”), or replacing incorrect words with better translations, although the original version would have given users the gist of the meaning (e.g. “the caves have been exploited” vs. “the caves have been mined”). I had trouble justifying the need to invest several hours correcting details that wouldn’t really affect the overall meaning of the text, and yet this question still nagged at me. So I thought that instead I would write a blog post to see what others thought: what is more ethical, making the corrections myself, or leaving the articles as they are, to see how they change over time without my influence?

Some of my favourite talks from the CATS conference at Brock University

I’ve just returned from the 27th annual conference organized by the Canadian Association for Translation Studies, which was held at Brock University in St. Catharines, Ontario this year. The theme was “Translation: Territories, Memory, History”, and although a number of the talks addressed topics you might expect to find in this theme, namely the history of translated texts in regions like Asia, Latin America and Brazil, others were more broadly related, addressing subjects like the history of language technologies in Canada, or “new territories” like fansubbing norms. Since many of these topics are likely to interest people who weren’t able to attend, I thought I would summarize some of my favourite presentations and offer a few thoughts on the wider implications of these research questions. Very roughly, the talks I most enjoyed can be grouped into three broad, and somewhat overlapping, categories that also match my own research interests: technological, professional and pedagogical concerns.

Technological Concerns

Two talks on technology-related topics were particularly intriguing: Geneviève Has, a doctoral candidate at Université Laval, spoke about the history of language technologies in Canada, focusing particularly on the role of the federal government in projects like TAUM-MÉTÉO, the very successful machine-translation system for meteorology texts, and RALI, a lab that developed programs like the bilingual concordancer TransSearch. Has explored some of the reasons why entire research labs or specific research projects had been dismantled, and noted that when emphasis is placed on producing marketable results within a set period of time, funding is often pulled from projects if the results are not what the funders are looking for, even if useful research is being produced by the lab. For instance, the quest to develop a machine translation system as successful as TAUM-MÉTÉO led to later systems being abandoned when the results were not as impressive.

Valérie Florentin, a doctoral candidate at the Université de Montréal, meanwhile, gave a fascinating talk on fansubbing norms, noting that in the English to French community she studied, online forum discussions between the fansubbers showed how they wanted to ensure the subtitles would be easily understood by francophones in various countries. Thus, they avoided regionalisms as well as expressions and cultural references they thought typical viewers would not understand. They also followed style guidelines to ensure the subtitles, on which various people had collaborated, would be consistent in terms of things like whether characters should use tu or vous to address one another. In her conclusions, she wondered whether the collaborative model used by this fansubbing community (in which about eight people translate and review the subtitles for any given episode) could be useful in professional communities. Recognizing that it would be unfeasible to expect companies to pay this many people to work on a project (even if each person was doing less work than they would if they prepared the subtitles alone), she argued that the model could be useful in training contexts, allowing students to debate with one another about cultural concerns and equivalents, while also following a set of style guidelines to ensure consistency in the final product. I found this suggestion particularly relevant to my own teaching, since I like to try collaborative models with my students, and since I have argued in other talks that crowdsourcing models often offer elements that could be adopted in professional translation, such as greater visibility for the translators who work on projects.

Professional Concerns

Marco Fiola, from Ryerson University, and Aysha Abughazzi, from Jordan University of Science and Technology, both spoke on translation quality. While Marco’s presentation explored competing definitions of translation quality and specifically addressed issues like understandability and usability, Aysha spoke about translation quality in Jordan, discussing the qualifications of translators and the quality of translations she obtained from various agencies. Both of these talks underscored for me the difficulty translators and translation scholars continue to have in defining quality and in determining what “professional” translation should look like.

Pedagogical Concerns

Philippe Caignon, an associate professor at Concordia University, offered an excellent presentation on concept mapping and cognitive mapping, illustrating how these can be useful for students in terminology courses as an alternative to tree diagrams. Although he didn’t show the software itself, he did mention that Cmap Tools can be used to create concept maps fairly easily. As I listened to his talk, I decided I could incorporate concept mapping into the undergraduate Theory of Translation course I usually teach, to help students think about the terms translation and translation studies. I think examples like this one would help students see how they can visualize translation, and if they had a few minutes to work on their concept map individually before discussing their map with the rest of the class, I think we would be able to explore the different ways translation can be understood. More on this after I’ve tried it out in class.

Wikipedia translation projects: Take 2

Last year was the first time I assigned a Wikipedia translation project in my Introduction to Translation into English course, and I was happy enough with the experience that I tried it again this year. Now that I’m more familiar with Wikipedia, I was able to change the assignment in ways that I hope improved both the student experience and the translations we produced. Here’s an overview of how I modified the assignment this year and what students thought about the project:

Overview of the assignment

For this assignment, students are required to work in groups to translate an article they choose. Like last year, I recommended they select from this list of 7000+ articles needing translation from French into English. Also like last year, as part of the assignment, students had to submit a report about how they divided the work and made their translation decisions. Finally, they had to do a presentation in front of the class to show their finished article and explain the challenges they faced when translating it.

First change: More training on Wikipedia policies and style guides

This year, I spent more class time talking about Wikipedia policies and discussing what would make a “good” English article. During our second week of class, for instance, we covered Wikipedia’s three Core Content Policies: neutral point of view, no original research, and verifiability of sources. I asked students to consider how this might affect the articles they chose to translate, and reminded them that they should be aware that even a single word in the source text (e.g. “talented”, “greatest”, “spectacular”) could run counter to these three policies. Last year, for instance, the group translating an article about a historic site in France found adjectives like “spectacular” and “great” used in the French article to describe a tower that stood on the site. In their translation, they deleted these adjectives, because they found them too subjective. After we discussed this example, I asked students to think of other evaluative words they might encounter in their source texts, and then we came up with some strategies for addressing these problems in their translations, including omitting the words and finding a reliable secondary source to quote instead (“X and Y have described the tower as ‘spectacular’”).

In Weeks 3 and 4, we took a closer look at the Wikipedia Manual of Style, and in particular, at the Manual of Style for articles about the French language or France and the Manual of Style for Canada-related articles. Though students could choose to translate articles on French-speaking regions other than France and Canada, only those two French-speaking countries have their own style guide. I pointed out the recommendations for accented characters and proper names and we discussed what to do in cases where no rule existed, or where considerable controversy continues to exist, as is the case for capitalization of French titles and expressions. In this case, we created our own rule (follow typical English capitalization rules), but students could still choose to do something else: they just had to justify their decision in the commentary accompanying their translation.

Second change: revised marking scheme

Last year, I’d intended to mark the translations just like any other assignment: I told students I would give them a grade for the accuracy of their translation, based on whether they had any errors like incorrect words and shifts in meaning, and a grade for English-language problems like grammar errors, spelling mistakes, and ambiguous wordings. But a good Wikipedia article also needs to have hyperlinks to other articles, citations to back up any facts, and various other features that are mentioned in the Manual of Style. My marking scheme from last year couldn’t accommodate these things. This year, I marked the translations out of 50, broken down as follows: 15 marks for the accuracy of the translation, 15 marks for the language, 10 marks for conforming to the Manual of Style and adding relevant hyperlinks to other Wikipedia articles, 5 marks for citing references and ensuring hyperlinks are functional, and a final 5 marks for ensuring the translation is posted to Wikipedia, with the corrections I suggested. I also had students submit their translations earlier so I could start marking them before the end of the semester, giving them time to post their final versions before the course was over. Together, these changes made the assignment work much better, and I noticed a big improvement in the quality of the final articles.
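For anyone who wants to adapt this marking scheme, the breakdown can be written out as a small rubric. The sketch below is only an illustration: the criterion names and the `total_mark` helper are my own hypothetical labels, not part of any grading tool I actually used, and the sample scores are invented.

```python
# Hypothetical rubric mirroring the 50-mark breakdown described above.
# The criterion names are illustrative labels, not official categories.
RUBRIC = {
    "accuracy": 15,                 # accuracy of the translation
    "language": 15,                 # English-language quality
    "style_and_links": 10,          # Manual of Style and links to other articles
    "references": 5,                # citations and functional hyperlinks
    "posted_with_corrections": 5,   # final version posted to Wikipedia
}


def total_mark(scores):
    """Add up the marks awarded, checking each against its rubric maximum."""
    for criterion, awarded in scores.items():
        if criterion not in RUBRIC:
            raise KeyError(f"Unknown criterion: {criterion}")
        if not 0 <= awarded <= RUBRIC[criterion]:
            raise ValueError(f"{criterion} must be between 0 and {RUBRIC[criterion]}")
    return sum(scores.values())


# Invented example scores for one group.
example = {
    "accuracy": 12,
    "language": 13,
    "style_and_links": 8,
    "references": 4,
    "posted_with_corrections": 5,
}
print(total_mark(example), "out of", sum(RUBRIC.values()))
```

Keeping the maximum for each criterion in one place makes it easy to confirm that the parts still add up to 50 if the weighting changes in a later year.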

Student reactions to the assignment

At first, some students were very nervous about working within the Wikipedia environment. In the first week of class, when I asked how many had ever edited a Wikipedia article, no one raised their hand. As the weeks went on, I heard comments from the groups about how they needed to spend some time figuring out the markup language, and how to use the sandbox, but by the end of the term, everyone succeeded in posting their translations online.

During their presentations this week, some students even noted that the markup language was fairly easy to learn and that they were glad to have more experience with it because it’s a tool they might need to use in the future. As I’d hoped, many students discovered that researching an article is a lot of work and that just because you’re interested in a topic doesn’t mean it will be easy to translate an article about it. Some students commented that adapting their texts to an English audience was challenging, particularly when English sources about the people and places they’d chosen to write about weren’t readily available. And nearly all of them felt the assignment has made them look at Wikipedia more critically: some students said they would check how recently an article had been updated (since their French article had out-of-date tourism statistics, for instance, or dead hyperlinks), while others said they would be looking to see whether the article cited reliable sources.

Not all of the translations have been corrected and posted online yet, but here are a few that have. I’ll update the list later, when everyone’s done: [List updated April 19]:

  • Aubagne (Students translated the “History”, “Politics” and “Environment and Environmental Policies” sections)
  • Fundy National Park (Students translated the “Natural Environment” and “Tourism and Administration” sections)
  • Louis Calaferte (Students translated the introduction, along with the “Early life” and “Career” sections)
  • Lyonnaise cuisine (Students translated the “Terroirs and culinary influences” and “The Mères” sections)
  • Die2Nite

On academic blogging

Some recent online articles weighing the pros and cons of academic blogging and academic publishing more broadly led me to reflect on my own reasons for blogging over the past 4 1/2 years.

One of the concerns academic bloggers have mentioned is that the writing they do for their blogs does not count as academic research: the posts are not peer-reviewed, so they will typically be counted as professional service rather than research in tenure and promotion assessments, even though blogs–being freely accessible online–are likely to reach a wider audience than a typical academic journal article. As one blogger noted, any time spent writing her blog was time not spent writing a peer-reviewed essay or a book that would “count” as research. And this is certainly something I have considered as well.

When I started this blog in 2009, I had a lot more time on my hands: I had just finished my PhD, was getting ready to teach three courses in the next academic year, and was looking forward to finally being able to write short posts in a single sitting, rather than trying to plow through a major project like a dissertation. Not unexpectedly, I posted much more actively then than I did last year, for instance, when I taught five courses, wrote three journal articles and edited the book review section of another journal. But I still enjoy blogging, even if I don’t have as much time for it. And, in case any other academics are trying to decide whether it’s worth starting a blog, here are a few reasons why I continue to post articles on this one:

  1. First, this blog has helped me connect with many people I would probably not otherwise have met: other researchers, of course, but also graduate students and non-academics from around the world. Over the last four years, several thousand people have visited the site. Some bloggers, of course, can attract that many visitors in a much shorter period, but I don’t have the time to write content more frequently and to promote the website more efficiently. And I’m happy with my readership figures: without this blog, I would not have been able to reach several thousand people who were otherwise interested in the topics I write about.
  2. Second, the blog is a great way to archive things I’m likely to want to look up again later. For instance, because I try to write about the conferences I’ve attended, I’m able to go back months or years later and double-check who said what at which event. I can also review what I was doing in my classes a few years ago and what I thought about it at the time. Without the blog, I probably wouldn’t have that kind of information at hand, since my conference notes would likely have ended up somewhere among the many stacks of papers covering my desk and filing cabinets.
  3. If, like me, you integrate your blog into a website (and WordPress allows you to do so very easily), you can also keep your CV up to date and provide links to (or full versions of) your articles. I realize that you could also do this via sites like Academia.edu, but I like having my own site, which gives me more control over the layout, structure, and kind of content I would like to include.
  4. Finally, with a blog, you can post material you’ve had to cut from longer papers but wouldn’t be able to develop into another full-length article. You can also work out ideas for projects you might later develop into a larger project, or reflect on topical issues that you’re never going to have time to develop into a full-length article. If you use your blog in this way, as I sometimes do, it becomes an extension of your writing activities, fodder for new work, and a platform to test out new ideas rather than a side-project taking you away from your “real” research.

These are my primary motivations for blogging, but I’m sure other bloggers could add more reasons to this list. In case you’d like to read other blogs about translation written from an academic’s perspective, here are a few of the blogs I follow that are written by people who are or were actively involved in Translation Studies:

Know of others? I’d be happy to update the list.

Integrating blogs into a Translation Studies course

At the CATS conference in May 2010, I attended a presentation by Philippe Caignon, who talked about his experience integrating blogs into a terminology course. (Incidentally, Philippe has just won one of the prestigious 3M National Teaching Fellowships, an honour he richly deserves, and which you can read about here.) After the conference, when I finally got around to writing a post about Philippe’s presentation, I resolved to add a blogging component to at least one of my courses in the next academic year. I felt that doing so would expose students to a platform they might use after graduation, since they might be maintaining a company blog, translating blog postings, or creating and sharing their own blogs. I also thought it would provide us with more flexibility, allowing students to reflect on the coursework and exchange ideas outside of the classroom. Although I never wrote a follow-up post, I did, in fact, integrate blogs into the MA-level Translation Studies course I taught in 2011. Since then, I’ve taught the course twice more, and I’ve made some changes to the way I incorporate blogging activities. I thought I would share some of the things I’ve learned in case others are considering adding a blogging component to their courses. I’ll focus on three aspects:

  • Blogging platforms
  • Designing the assignments
  • Grading

Blogging platforms

Although Learning Management Systems like Moodle and Blackboard have integrated blogging tools, I wanted to have students work with a platform they’d be likely to use again outside of the classroom. So I asked students to create their own blogs using WordPress or Blogger, and then send me the URL so I could add a link to these blogs on the course website. This solution has worked out well: the blogs are easy to find (since they’re all listed in the “blogroll” of the course website), students can express themselves more creatively because they can customize the look and feel of their blogs, and I don’t have to deal with emailed assignments and incompatible file formats because every graded assignment must be submitted as a blog post.

So what about privacy then? You might be wondering why students would want to share all their coursework with a) everyone on the Internet and b) everyone else in the class. The answer to the first issue is easy: blogs don’t have to be visible to search engines, nor do they have to be accessible to every Internet user. Once I told students they had to create their own blogs, I made sure to explain how to adjust the privacy settings so that the blog remained invisible to search engines and/or could be accessed by invitation only. Most students chose to make their blogs invisible to search engines, because if they made their blogs private, they would have to email invitations to everyone else in the class. I did mention, though, that they could always change the privacy settings once the course was over, making their blogs as accessible or inaccessible as they wanted.

As for the second issue, whether students might be reluctant to share their coursework with their classmates, I invited everyone to use pseudonyms. Some students liked this option, and named their blogs something like “Translation Studies 5100” or “Glendon Translation Student.” Others didn’t seem to mind either way and used their real names. I also informed everyone that when commenting on their classmates’ blogs, they had to be respectful and constructive, rather than negative. To date, I haven’t had any problems with inappropriate comments. Generally, students have found the feedback from their peers very helpful. In fact, many of the comments offered a perspective very different from mine: details about cultures and languages with which I’m unfamiliar, references to sources I hadn’t seen, etc. And as one student mentioned to me last year, students are able to get a better sense of how they compare to their classmates, in terms of their writing skills, their background knowledge and their familiarity with theoretical texts, which can give them greater confidence in their own skills or alert them that they may need to do some catching up.

Designing the assignments

A big mistake I made the first time I assigned blogging as part of the coursework was not indicating specific deadlines for the blog posts. Although students were required to post five critical reflections on the assigned readings, I didn’t assign a specific due date for each post because I wanted to provide some flexibility about which readings the posts could cover. Unfortunately, most of the students procrastinated and posted nothing until the last week of the semester, leaving their classmates with very little to comment on (more on that in a minute). Ever since then, I’ve assigned fewer blog posts (just two critical reflections this year), and I’ve also set specific due dates for these posts: the first is due in Week 4 of our 13-week course, and the second is due in Week 8. These deadlines still allow students to choose which course readings they want to comment on in their posts, but they also ensure students are submitting their posts throughout the semester rather than at the end.

As part of the coursework, students are required to comment on at least six different blog posts over the course of the semester. This means they can read six different blogs and leave comments on each one, or they can leave several comments on just two or three blogs. After my experience the first year, I’ve set deadlines here as well: comments are due by Weeks 6 and 10, though of course everyone is welcome to leave comments at any time. And based on some of the advice Philippe gave during his presentation, I also require students to respond to the comments they receive from their peers: this helps maintain a dialogue rather than a one-way discussion.

Grading

The critical reflections, along with all the other coursework (like an annotated bibliography and the final paper), are submitted via the blog and are all marked in the same way I’d grade a traditional paper: based on the clarity of the argument, the relevance of the examples, the extent of the documentation, etc. I send students an individual email with my feedback and their grade because I don’t feel this is something that should be shared with everyone.

As for the comments, I assign a mark for completion, provided the comment meets the standards I set out in the syllabus (i.e. it offers thoughtful constructive criticism that also highlights some of the argument’s strengths). At the end of the term, I tally up the number of comments and replies, award an A+ to any student(s) who went beyond the requirements, A’s to the students who left the required number of comments and replies, B’s to the students who missed a few, and so on. In total, comments are worth 15% of the final grade for the course (10% for comments and 5% for replies).

Overall, I think blogs are a useful tool to integrate into the classroom. Although this was a graduate course, Philippe’s presentation focused on his experience with an undergraduate class, so blogs can definitely be used in a variety of contexts to achieve multiple learning objectives, including peer collaboration, asynchronous discussions, and critical reflections on the coursework.

What translators can learn from the F/OSS community

Looking through my blog archives late last year, I was disappointed to discover I’d posted only seven articles in all of 2013: usually my goal is to get at least one post up every month, and last year was the first time since 2009 that I hadn’t been able to achieve that. So my goal for this year is to blog more frequently and more consistently. And with that, here is my first post of 2014:

In November, I came across a blog post that hit on a number of issues relevant to the translation industry, even though it was addressed to the Free/Open-Source Software (F/OSS) community. It’s called The Ethics of Unpaid Labour and the OSS Community, and it appeared on Ashe Dryden’s blog. Ashe writes and speaks about diversity in corporations, and so her post focused on how unpaid OSS work creates inequalities in the workforce. As she argues, the demographics of OSS contributors differ from those of proprietary software developers as well as the general population, with white males overwhelmingly represented among OSS contributors: one source Ashe cites, for instance, remarks that only 1.5% of F/OSS contributors are female, compared to 28% of contributors for proprietary software. Ashe notes that lack of free time among groups that are typically marginalized in the IT sector (women, certain ethnic groups, people with disabilities, etc.) is the main reason these groups are under-represented in OSS projects.

These demographics are problematic for the workforce because many software developers require their employees (and potential new hires) to have contributed to F/OSS projects. And while some large IT firms do allow employees to contribute to such projects during work hours, people from marginalized groups often do not work at these kinds of companies. This means people who would like to find paid employment as software developers probably need to be able to devote unpaid hours to F/OSS projects so they have a portfolio of publicly available code for employers to consult.

So how is this relevant to translators, or the translation industry? It’s relevant because the same factors affecting the demographics of the F/OSS community are also likely to affect the demographics of the crowdsourced translation community. People can volunteer to translate the Facebook interface only if they have free time and access to a computer; likewise, people with physical disabilities that make interacting with a computer difficult are likely to spend less time participating in crowdsourced projects than people with no disabilities. And since, in many cases, the community of translators participating in a crowdsourced project will largely determine how quickly a project is completed, what texts are translated and what language pairs will be available, the profile of participants is important.

Unfortunately, we don’t have a lot of data about the profiles of people who participate in crowdsourced (or volunteer) projects. The studies that have been done do hint at a larger question worth exploring: O’Brien & Schäler’s 2010 article on the motivations of The Rosetta Foundation’s volunteer translators noted that the group of translators identifying themselves as “professionals” was overwhelmingly female (82%), while the gender of those identifying themselves as amateurs was more balanced (54% female). The source languages of the volunteers were mainly English, French, German and Spanish. My own survey of Wikipedia translators found that 84% of the respondents were male and 75% were younger than 36. Because both these projects show that people with certain profiles participated more than others, it’s clear there’s a need for more research. If we had a better idea of the profiles of those who participate in other crowdsourced translation projects, we would be able to see whether some projects seem more attractive to one gender, which language pairs are most often represented, and what kinds of content are being translated for which language communities. And we could then try to figure out whether (and if so, how) to make these projects more inclusive.

Since it’s still a point of debate whether relying on crowdsourcing to translate the Twitter interface, a Wikipedia article or a TED Talk is beneficial to the translation industry, let’s leave that question aside for a moment and just consider the following: one of the benefits both novice and seasoned translators are supposed to be able to reap from participating in a crowdsourced project is visible recognition for their work. Online, accessible user profiles are common in crowdsourcing projects (as in this example from the TED translator community), and translators are usually given visible credit for their contributions to a project (as in this Webflakes article, which credits the author, translator and reviewer). If certain groups of people are more likely to participate in crowdsourced projects, that means this kind of visibility is available only to them. If we take a look at where the software development industry is headed (with employers actively seeking those who have participated in F/OSS projects, putting those who cannot freely share their code at a disadvantage), translators could eventually see a similar trend, putting those who are unable (or who choose not) to participate in crowdsourced projects at a disadvantage.

I think this is a point worth considering when crowdsourced projects are designed, and it’s certainly a point worth exploring in further studies, as it raises a host of ethical questions that deserve a closer look.

Translation in Wikipedia

I’ve been busy working on a new paper about translation and revision in Wikipedia, which is why I haven’t posted anything here in quite some time. I’ve just about finished now, so I’m taking some time to write a series of posts related to my research, based on material I had to cut from the article. Later, if I have time, I’ll also write a post about the ethics and challenges of researching translation and revision trends in Wikipedia articles.

This post talks about the corpus I’ve used as the basis of my research.

Translations from French and Spanish Wikipedia via Wikipedia:Pages needing translation into English

I wanted to study translation within Wikipedia, so I chose a sample project, Wikipedia:Pages needing translation into English, and compiled a corpus of articles translated in whole or in part from French and Spanish into English. To do this, I consulted recent and previous versions of Wikipedia:Pages needing translation into English, which has a regularly updated list split into two categories: the “translate” section, which lists articles Wikipedians have identified as having content in a language other than English, and the “cleanup” section, which lists articles that have been (presumably) translated into English but require post-editing. Articles usually require cleanup for one of three reasons: the translation was done using machine translation software, the translation was done by a non-native English speaker, or the translation was done by someone who did not have a good grasp of the source language. Occasionally, articles are listed in the cleanup section even though they may not, in fact, be translations: this usually happens when the article appears to have been written by a non-native speaker of English. (The Aviación del Noroeste article is one example). Although the articles listed on Wikipedia:Pages needing translation into English come from any of Wikipedia’s 285 language versions, I was interested only in the ones described as being originally written in French or Spanish, since these are my two working languages.

I started my research on May 15, 2013, and at that time, the current version of the Wikipedia:Pages needing translation into English page listed six articles that had been translated from French and ten that had been translated from Spanish. I then went back through the Revision History of this page, reading through the archived version for the 15th of every month between May 15, 2011 and April 15, 2013: that is, the May 15, 2011, June 15, 2011, July 15, 2011 all the way to April 15, 2013 versions of the page, bringing the sample period to a total of two years. In that time, the number of articles of French origin listed in either the “translate” or “clean-up” sections of the archived pages came to a total of 34, while the total number of articles of Spanish origin listed in those two sections was 60. This suggests Spanish to English translations were more frequently reported on Wikipedia:Pages needing translation into English than translations from French. Given that the French version of the encyclopedia has more articles, more active users and more edits than the Spanish version, the fact that more Spanish to English translation was taking place through Wikipedia:Pages needing translation into English is somewhat surprising.
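For anyone who wants to repeat this kind of sampling, here is a minimal sketch in Python of how the list of snapshot dates described above could be generated. The function name monthly_snapshots is my own invention for illustration; it only produces the dates and does not fetch anything from Wikipedia.

```python
from datetime import date

def monthly_snapshots(start, end):
    """Return one date per month, on start's day of the month, from start to end inclusive."""
    dates = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        dates.append(date(year, month, start.day))
        month += 1
        if month > 12:
            # Roll over into January of the following year.
            year, month = year + 1, 1
    return dates

# The archived versions consulted: the 15th of each month, May 2011 to April 2013.
snapshots = monthly_snapshots(date(2011, 5, 15), date(2013, 4, 15))
print(len(snapshots), snapshots[0], snapshots[-1])
```

This gives 24 archived dates, to which the current version consulted on May 15, 2013 is added.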

Does this mean only 94 Wikipedia articles were translated from French or Spanish into English between May 2011 and May 2013? Unlikely: the articles listed on this page were identified by Wikipedians as being “rough translations” or in a language other than English. Since the process for identifying these articles is not fully automated, many other translated articles could have been created or expanded during this time: other “rough” translations may not have been spotted by Wikipedia users, while translations without the grammatical errors and incorrect syntax associated with non-native English might have passed unnoticed by Wikipedians who might have otherwise added the translation to this page. So while this sample group of articles is probably not representative of all translations within Wikipedia (or even of French- and Spanish-to-English translation in particular), Wikipedia:Pages needing translation into English was still a good source from which to draw a sample of translated articles that may have undergone some sort of revision or editing. Even if the results are not generalizable, they at least indicate the kinds of changes made to translated articles within the Wikipedia environment, and therefore, whether this particular crowdsourcing model is an effective way to translate.

So what are these articles about? Let’s take a closer look via some tables. In this first one, I’ve grouped the 94 translated articles by subject. Due to rounding, the percentages do not add up to exactly 100.

Subject Number of French articles Percentage of total (French) Number of Spanish articles Percentage of total (Spanish)
Biography 20 58.8% 20 33.3%
Arts (TV, film, music, fashion, museums) 3 8.8% 8 13.3%
Geography 2 5.8% 12 20%
Transportation 2 5.8% 4 6.7%
Business/Finance (includes company profiles) 2 5.8% 2 3.3%
Politics 1 2.9% 4 6.7%
Technology (IT) 1 2.9% 1 1.6%
Sports 1 2.9% 1 1.6%
Education 1 2.9% 1 1.6%
Science 1 2.9% 0 0%
Architecture 0 0% 2 3.3%
Unknown 0 0% 3 5%
Other 0 0% 2 3.3%
Total 34 99.5% 60 99.7%

Table 1: Subjects of translated articles listed on Wikipedia:Pages needing translation into English (May 15, 2011-May 15, 2013)
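To illustrate why a column of rounded percentages may not add up to exactly 100, here is a small sketch in Python that recomputes the shares from the article counts in Table 1. The helper names are hypothetical, and the round-half-up rule used here is an assumption on my part, so individual values can differ from the table in the last digit.

```python
def percent_tenths(count, total):
    """Share of total in tenths of a percent, rounded half up using integers only."""
    return (count * 1000 + total // 2) // total

# Article counts per subject, copied from Table 1 (same row order).
french = [20, 3, 2, 2, 2, 1, 1, 1, 1, 1, 0, 0, 0]
spanish = [20, 8, 12, 4, 2, 4, 1, 1, 1, 0, 2, 3, 2]

def column_total(counts):
    """Sum the individually rounded percentages of one column."""
    total = sum(counts)
    return sum(percent_tenths(c, total) for c in counts) / 10

french_total = column_total(french)
spanish_total = column_total(spanish)
print(f"French: {french_total}%  Spanish: {spanish_total}%")
```

Under this rule the French column sums to 99.8% rather than 100%, because each row is rounded separately before the column is added up.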

As we can see from this table, biographies are by far the largest category among the translations from French and Spanish listed on Wikipedia:Pages needing translation into English: they cover politicians, musicians, actors, engineers, doctors, architects, and even serial killers. While some of these biographies are of historical figures, most are of living people. The arts (articles about TV shows, bands, museums, and fashion) were also a popular topic for translation. In the translations from Spanish, articles about cities or towns in Colombia, Ecuador, Spain, Venezuela and Mexico (grouped here under the label “geography”) were also frequent. So it seems that the interests of those who have been translating articles from French and Spanish as part of this initiative have focused on arts, culture and politics rather than specialized topics such as science, technology, law and medicine. That may explain why articles are also visibly associated with French- and Spanish-speaking regions, as demonstrated by the next two tables.

I created these two tables by consulting each of the 94 articles, except in cases where the article had been deleted and no equivalent article could be found in the Spanish or French Wikipedias (marked “unknown” in the tables), and I identified the main country associated with the topic. A biography about a French citizen, for instance, was counted as “France”, as were articles about French subway systems, cities and institutions. Every article was associated with just one country. Thus, when a biography was about someone who was born in one country but lived and worked primarily in another, I labelled the article as being about the country where that person had spent the most time. For instance, Manuel Valls (http://en.wikipedia.org/wiki/Manuel_Valls) was born in Barcelona, but became a French citizen over thirty years ago and is a politician in France’s Parti socialiste, so this article was labelled “France.”

Country/Region Number of articles
France 24
Belgium 2
Cameroon 1
Algeria 2
Canada 1
Western Europe 1
Switzerland 1
Romania 1
n/a 1
Total 34

Table 2: Primary country associated with translations from French Wikipedia

 

Country Number of articles
Spain 13
Mexico 10
Colombia 10
Argentina 7
Chile 3
Venezuela 3
Peru 2
Ecuador 2
Nicaragua 1
Guatemala 1
Uruguay 1
United States 1
Cuba 1
n/a 1
Unknown 4
Total 60

Table 3: Primary country associated with translations from Spanish Wikipedia

Interestingly, these two tables demonstrate a marked contrast in the geographic spread of the articles: about 70% of the French source articles (24 of 34) dealt with one country (France), while 55% of the Spanish source articles (33 of 60) were spread across three (Spain, Colombia and Mexico), with nearly equal representation for each country. The two tables do, however, demonstrate that the vast majority of articles had strong ties to either French or Spanish-speaking countries: only two exceptions (marked as “n/a” in the tables) did not have a specific link to a country where French or Spanish is an official language.

I think it’s important to keep in mind, though, that even though the French/Spanish translations in Wikipedia:Pages needing translation into English seem to focus on biographies, arts and politics from France, Colombia, Spain and Mexico, translation in Wikipedia as a whole might have other focuses. Topics might differ for other language pairs, and they might also differ in other translation initiatives within Wikipedia and its sister projects (Wikinews, Wiktionary, Wikibooks, etc.). For instance, the WikiProject:Medicine Translation Task Force aims to translate medical articles from English Wikipedia into as many other languages as possible, while the Category: Articles needing translation from French Wikipedia page lists over 9,000 articles that could be expanded with content from French Wikipedia, on topics ranging from French military units, government, history and politics to geographic locations and biographies.

I’ll have more details about these translations in the coming weeks. If you have specific questions you’d like to read about, please let me know and I’ll try to find the answers.

ACFAS Conference

I’ve just returned from Quebec City, where I was attending the 81st Congress of the Association francophone pour le savoir (ACFAS), which took place at the Université Laval this year. It was the first time I’d been to an ACFAS event, which, for those of you who might not know, is similar to the Congress of the Humanities and Social Sciences in that a number of conferences from different disciplines take place there, each organized by a different group of scholars. Unlike the Congress of the Humanities and Social Sciences, which is held at universities across Canada and is bilingual, ACFAS is usually hosted by Quebec universities and takes place entirely in French.

This year, three translation-related conferences were taking place at ACFAS, and I was able to attend two of them: La formation aux professions langagières : nouvelles tendances (Training Language Professionals: New Trends), which took place on Wednesday, and La traduction comme frontière (Translation as Borders), which took place Thursday and Friday. Unfortunately, I had to miss the third conference, Langues et technologies : chercheurs, praticiens et gestionnaires se donnent rendez-vous (Languages and Technologies: A Meeting of Researchers, Practitioners and Managers), because it was taking place at the same time as the conference on translation as borders, where I was presenting a paper. But here are a few points I found interesting and useful at the two conferences I did manage to attend:

La formation aux professions langagières: Nouvelles tendances
This conference gave me a lot of practical ideas to integrate into my courses next year. For instance, I really enjoyed the presentation by Mathieu Leblanc, who carried out an ethnographic study at three Language Service Providers (one public and two private) several years ago. These three LSPs each had at least 35 employees, including new and experienced translators, and he spent one month at each one, conducting interviews and observing workplace practices. (Mathieu presented some of the data from this study at the CATS conference last year. I wrote about it in this post). Although his research goal had been to study translator attitudes toward tools like Translation Memories, the data he gathered during his fieldwork also allowed him to explore questions like “What do translators think about university training programs?” He found that although both novice and experienced translators said university training was good overall, some areas could still be improved: students could be better prepared to meet the productivity demands they will encounter at the workplace, taught not to rely so extensively on tools like Translation Memories, and encouraged to be more critical of sources and translations.

The presentation by Université de Sherbrooke doctoral candidate Fouad El-Karnichi focused on converting traditional courses to online environments, and I learned that other universities are using a variety of platforms to offer real-time translation courses online. At Glendon, we’ve adopted Adobe Connect for the Master of Conference Interpreting, but the Université du Québec à Trois-Rivières is using Via for their new online BA in translation. I’ll have to take a look at it to see how it works. Fouad has just posted a few of his own thoughts on the ACFAS conference. You can read them on his blog here.

Finally, Éric Poirier, from the Université du Québec à Trois-Rivières, described a number of activities that could be integrated into a translation course to help familiarize students with online documentary resources like dictionaries, corpora, and concordancers. Here are a few of the activities I found interesting:

  • Have students use a corpus to find collocations for a base word (e.g. Winter + ~cold = harsh)
  • Have students read one of the language columns in Language Update and then translate the word that’s been discussed
  • Have students practice using dictionaries to distinguish between paronyms like affect and effect

In an online course, these kinds of activities could be integrated into the course website via an online form or a quiz that needs to be completed.

Other presentations were very interesting as well, but this post is getting a little long, and I also wanted to discuss some of the talks from the second conference.

La traduction comme frontière
Although several presenters cancelled their talks on the first day, we still had some very stimulating discussions about translation as borders, whether these borders are real, imagined, pragmatic, semantic, political, ideological or something else entirely. Two papers were particularly thought-provoking (at least to me): Chantal Gagnon, from the Université de Montréal, spoke about Canadian Throne Speeches since 1970, with particular emphasis on the words “Canada”, “Canadien/canadien” and “Canadian” in these speeches. The fact that the number of occurrences of these words in English and French differed was not really surprising, since Chantal had found similar differences in other Canadian speeches, but the fact that the 2011 Throne Speech under Prime Minister Harper differed from the others was very intriguing. Finally, Alvaro Echeverri, also from the Université de Montréal, raised some very illuminating questions about the limits of translation, particularly with respect to how we might define the term translation. Based on work by Maria Tymoczko, he proposed studying the corpus of texts before trying to determine what should be considered a translation: that way, researchers will know what kinds of translations/adaptations/inspirations to include.

So all in all, these three days in Quebec City were very stimulating, and I’m anxious to incorporate some of these ideas into my courses next year and my research this summer.