2011-02-13

Thoughts on the "language industry": Size, standards, and some neat web demos

Lately I have been reading quite a few things about the "language industry" (yes, this is work-related). It started with a study on "The size of the language industry", which is available from the publications inventory of the European Commission's Directorate-General for Translation.

In this study, the authors attempt to estimate the size (or value) of the European language industry, based on an estimate of the annual turnover of several industry sectors in 2008. The reliability of their results is fairly limited, as they state themselves in their Executive Summary:
"Due to the lack of accurate data as explained later in this summary, the figures derived about the size and the volume of the language industry in Europe are often based on assumptions and must therefore be considered highly speculative." (page iv)

They report the following results (page 20):

Language industry sector | Estimated turnover in 2008 (million euro)
1. Translation and interpreting, software localization, and website globalization | 5675
2. Language technology tools | 568
3. Subtitling and dubbing | 633
4. Language teaching | 1579
5. Conference organization | 143
Total | 8454

Most notably, the figures for the sectors "2. Language technology tools" and "3. Subtitling and dubbing" were each merely estimated at about 10% of the "translation and interpreting" sector (page 20), and the estimate for sector "4. Language teaching" is based on data from only two European countries (page 86). In my opinion, this makes the resulting figures highly problematic.
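
As a quick sanity check on the "about 10%" remark, here is a minimal Python sketch that relates the two derived sectors to the translation sector. It uses only the figures quoted above; the dictionary keys and the helper name `share_of_translation` are my own labels, not anything from the study.

```python
# Turnover figures in million euro for 2008, as quoted from page 20 of the study.
# The key names are my own shorthand for the study's sector names.
turnover = {
    "translation_and_interpreting": 5675,
    "language_technology_tools": 568,
    "subtitling_and_dubbing": 633,
}


def share_of_translation(sector: str) -> float:
    """Return a sector's turnover as a percentage of the translation sector."""
    return 100 * turnover[sector] / turnover["translation_and_interpreting"]


for sector in ("language_technology_tools", "subtitling_and_dubbing"):
    print(f"{sector}: {share_of_translation(sector):.1f}%")
```

Both shares come out at roughly a tenth of the translation figure, which fits the impression that these two numbers are derived from the first one rather than measured independently.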

However, the study is interesting as an entry point, since it reveals a lot about the structure of the language industry and points to some big players on the scene. I started, more or less by coincidence, to look up their names on the web and (yes, via Twitter) found some interesting articles.

For instance, Arle Lommel, Director of Open Standards at LISA, writes (among other things) about standards in the localization and terminology management business:

Through LISA, a number of localization-specific standards were developed. Translation Memory eXchange (TMX), even if it had never quite lived up to some expectations, has made it possible for companies to migrate their TM assets from one tool to another. TBX has done the same for terminology data. Other standards, like SRX and GMX-V, have not received the sort of implementation that would allow them to solve the industry problems they address, but they are starting to be used. XLIFF (developed at OASIS) has found its niche in facilitating interoperability between content creation tools and CAT tools.

Further on, however, Lommel argues that these standards address needs from the past. He suggests

a fresh assessment of standards. What are the standards we need now? Let’s set TMX, TBX, SRX, XLIFF, and ITS aside for the moment and ask what we’d create today if we were starting from scratch. I suspect none of these standards. They have many good (even brilliant) ideas that we can use, but we need new architecture.

Well, it's worth reading to the end. His ideas remind me of some current developments in the world of language resources and technology. Anyway.

Just a list of other neat things I found on the (twitter) way (mostly from @jeromobot if I recall correctly),
Cool stuff, but I have to stop here.
