Archive for the 'resources' Category

Tanaka Corpus available in Felix TM and TMX formats

Feb. 6th 2010

I converted the Tanaka Corpus of aligned Japanese and English sentences into Felix translation memory (TM) and TMX formats.

The Tanaka Corpus is a collection of around 150,000 Japanese-English sentence translation pairs, compiled over several years by university students, with later cleanup and correction by Jim Breen and his colleagues.
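As a rough illustration of what turning sentence pairs like these into TMX involves, here is a minimal sketch using Python's standard library. It is not the script used for the actual conversion. The function name `pairs_to_tmx`, the tool name in the header, and the sample sentence are all made up for the example, and reading the Tanaka Corpus file itself is left out.

```python
import xml.etree.ElementTree as ET

# ElementTree spells the xml:lang attribute with the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def pairs_to_tmx(pairs, src_lang="ja", tgt_lang="en"):
    """Build a minimal TMX 1.4 document from (source, target) sentence pairs."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "example-converter",  # hypothetical tool name
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "plain-text",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for source, target in pairs:
        # One translation unit (<tu>) per pair, one variant (<tuv>) per language.
        tu = ET.SubElement(body, "tu")
        for lang, text in ((src_lang, source), (tgt_lang, target)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


# Made-up sample pair, only to show the call.
print(pairs_to_tmx([("猫です。", "It is a cat.")]))
```

Each pair becomes one `<tu>` element holding a `<tuv>` per language, which is the basic shape most translation memory tools expect from a TMX file.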

Download the Felix/TMX versions of the Tanaka Corpus here.

Posted by Ryan Ginstrom | in Felix, resources

Felix glossaries compiled from Wiktionary

Nov. 14th 2008

I’ve just added 1,388 new glossaries covering 43 languages, compiled from the Wiktionary project.

Go to Felix Wiktionary glossaries page

Wiktionary is a community-contributed dictionary site and a sister project of Wikipedia. There are hundreds of languages on Wiktionary, but I narrowed this down to 43 using this list of the 50 most widely spoken languages in the world.

The glossaries were compiled from a site snapshot taken on November 12, 2008. I scanned through the XML site download, created lists of all translation pairs, and then compiled Felix glossaries from them.
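The scan-and-collect step can be sketched roughly as follows. This is an illustration under assumptions, not the script that produced the glossaries: it assumes translations are marked up with templates such as `{{t|fr|chat}}` or `{{t+|de|Katze|f}}`, the names `translation_pairs` and `local_name` are invented for the example, and writing the results out as Felix glossaries is omitted.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict
from io import BytesIO

# Assumed markup: {{t|fr|chat}}, {{t+|de|Katze|f}}, {{t-|es|gato}}
TRANSLATION = re.compile(r"\{\{t[+-]?\|([a-z-]+)\|([^|}]+)")


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def translation_pairs(dump, source_lang="en"):
    """Map (source_lang, target_lang) to a list of (headword, translation)."""
    pairs = defaultdict(list)
    title = None
    # iterparse streams the file, so a large dump is never fully in memory.
    for _event, elem in ET.iterparse(dump):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text" and elem.text:
            for lang, word in TRANSLATION.findall(elem.text):
                pairs[(source_lang, lang)].append((title, word.strip()))
        elif name == "page":
            elem.clear()  # drop the finished page's content
    return dict(pairs)


# Tiny made-up stand-in for the real site download.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.3/">
<page><title>cat</title><revision><text>
====Translations====
* French: {{t+|fr|chat|m}}
* German: {{t|de|Katze|f}}
</text></revision></page></mediawiki>"""

for pair, entries in translation_pairs(BytesIO(SAMPLE)).items():
    print(pair, entries)
```

Grouping the entries by language pair up front means each group can then be written out as its own glossary file.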

Wiktionary is licensed under the GNU Free Documentation License, and so are the Felix glossaries compiled from it.

Posted by Ryan Ginstrom | in Felix, resources

EDICT dictionary files available as Felix glossaries

Jun. 10th 2008

I’ve converted the EDICT and ENAMDICT dictionary files created by Jim Breen into Felix format. The converted glossary files are available from the Felix website.

The EDICT file is multilingual (Japanese/English/French/German/Russian), and I’ve converted it into 20 Felix glossaries, one for each source/translation language combination. Of course, since Japanese is the central language, language pairs that don’t have Japanese as the source or translation language may be less useful.
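The figure of 20 follows from counting ordered pairs of the five languages (5 × 4), since each pair is counted once per direction. A short check, with variable names chosen only for this example:

```python
from itertools import permutations

LANGUAGES = ["Japanese", "English", "French", "German", "Russian"]

# Ordered (source, translation) pairs: Japanese->English and
# English->Japanese count as two separate glossaries.
language_pairs = list(permutations(LANGUAGES, 2))
print(len(language_pairs))  # 20

# Pairs that have Japanese on one side or the other.
with_japanese = [pair for pair in language_pairs if "Japanese" in pair]
print(len(with_japanese))  # 8
```

So 8 of the 20 glossaries have Japanese as the source or translation language, and the remaining 12 are the pairs that may be less useful.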

The ENAMDICT file is a dictionary of proper names. Altogether the file was humongous, so I broke it into several smaller glossaries by category (personal names, place names, organizations, and so on). Of course, you’re free to load them all up into your Felix glossary window, since the number of glossaries you can have open is limited only by how much memory your computer has.

Posted by Ryan Ginstrom | in resources