Archive for May, 2008

Felix tip: Choosing your own segmentation

May. 31st 2008
The Felix CAT tool

Clicking the right arrow button (or pressing Alt + Right Arrow) will select the next segment for Felix to look up. In MS Word, MS PowerPoint, and TagAssist, this means the next sentence or line of text. For MS Excel, this means the next cell in the worksheet.

Sometimes you may want more fine-grained control of how segments are selected. This is quite simple: just select the text you want to look up, and click the “L” button (or press Alt + L). That will be the segment that Felix looks up. This works in MS Word, MS PowerPoint, and TagAssist.

It’s also fairly easy to extend your lookup segment. This is useful if you want to translate two or more segments/sentences as a single unit. From Word or PowerPoint, press Ctrl + Right Arrow to extend the lookup to the next segment. In TagAssist, the keyboard shortcut is Alt + X. (I plan on making the keyboard shortcuts more consistent in a future version.) Since the Excel interface is cell-based, it’s not possible to extend the lookup from Excel.

In Microsoft Word, you can also control several aspects of segmentation from the preferences. From the Felix menu, select Felix Preferences, then the Segmentation tab.

Felix segmentation preferences for MS Word

Here you can select the “stop” characters (which characters mark the end of a segment), whether to skip segments containing only numbers (useful when translating tables of figures), and whether to skip segments unless they contain Asian characters, or unless they don’t contain Asian characters. (The label says Japanese, but it works for Japanese/Chinese/Korean. This is a UI bug that will be fixed in the next minor release.)
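To make these options concrete, here is a minimal Python sketch of how stop characters and the two skip rules could work together. It is only an illustration and not Felix's actual code: the function names, the default stop characters, and the Unicode ranges used to detect Asian characters are all my assumptions.

```python
# Hypothetical sketch of segmentation options; not Felix's implementation.

# Assumed ranges: kana, CJK unified ideographs, Hangul syllables.
ASIAN_RANGES = ((0x3040, 0x30FF), (0x4E00, 0x9FFF), (0xAC00, 0xD7A3))


def split_segments(text, stop_chars=".!?。！？"):
    """Cut text after each stop character and at each line break."""
    segments, current = [], []
    for ch in text:
        if ch == "\n":
            segments.append("".join(current))
            current = []
        else:
            current.append(ch)
            if ch in stop_chars:
                segments.append("".join(current))
                current = []
    segments.append("".join(current))
    # Drop empty pieces and surrounding whitespace.
    return [s.strip() for s in segments if s.strip()]


def has_asian(segment):
    """True if any character falls in one of the assumed Asian ranges."""
    return any(lo <= ord(ch) <= hi for ch in segment for lo, hi in ASIAN_RANGES)


def is_numbers_only(segment):
    """True if the segment has digits but no letters of any script."""
    return any(ch.isdigit() for ch in segment) and not any(
        ch.isalpha() for ch in segment
    )


def select_segments(text, stop_chars=".!?。！？", skip_numbers=True, asian=None):
    """asian=True keeps only segments with Asian characters,
    asian=False keeps only segments without them, None keeps both."""
    result = []
    for segment in split_segments(text, stop_chars):
        if skip_numbers and is_numbers_only(segment):
            continue
        if asian is not None and has_asian(segment) != asian:
            continue
        result.append(segment)
    return result


text = "これはペンです。1,234\nThis is a pen. Next one!"
print(select_segments(text, asian=True))   # only the Japanese sentence
print(select_segments(text, asian=False))  # only the two English sentences
```

A real segmenter needs more care than this. For example, a period inside a decimal number or an abbreviation would end a segment here.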

The Felix manual has more information about segmentation for the MS Word and MS PowerPoint interfaces.

Posted by Ryan Ginstrom | in Felix, tips | 1 Comment »

New tool: Jamming2Felix

May. 29th 2008

I’ve just released a new free tool: Jamming2Felix.

Jamming2Felix is a simple utility that converts glossaries from jamming format into Felix format.

I originally developed this application at the request of a jamming user, and am now releasing it to the general community. Enjoy, and let me know if you run into any problems.

Posted by Ryan Ginstrom | in tools | No Comments »

Development roadmap for Felix

May. 25th 2008

Now that my translation memory system has a new start as Felix, I’ve got a lot of plans for its development. In this post I want to lay out my development roadmap for the next three months. In June, I’m going to be working on a minor release with several small enhancements. After that, I’m going to be working on two new features in July and August: translation history and network support (described below). I haven’t decided in which order to implement them, so if you have a preference please let me know!

Translation history

This feature will be somewhat analogous to the Trados “bilingual file” concept, except that the information will be stored in a separate file. For example, if you’re translating a file called “MyFile.doc”, the translation history file would be “MyFile.doc.fth”.

This will make reviewing translations a lot easier. It will also make it possible to get rid of the “translation” and “review” modes (which I think introduce too much complexity); instead, Felix will automatically know whether the current segment is a source or a translation, and behave accordingly.

Another benefit of the translation history feature is that it will allow integration with Trados-based workflows. I’ve long anguished over what to do about “bilingual” Trados files. There is demand to support these in Felix, but I absolutely didn’t want to do it the same way in Felix. I think embedding hidden text in your translation that later needs to be “cleaned up” is a horrible, horrible idea and one of the main reasons why I developed Felix. With a translation history feature, however, I could create a filter that translated between “bilingual” files and translation history files.

Network support

This feature will allow multiple translators to use the same memory simultaneously over a network, or over the Internet using a VPN. This eliminates the coordination problem that arises when two or more translators work on the same project simultaneously: translator A translates sentence X or term Y one way, and translator B another way, because they haven’t seen each other’s translations yet.

Further out

I don’t have anything concrete planned beyond August, but there are a number of things coming down the pipe. One is a new and improved Align Assist program (for “aligning” legacy translations to create translation memories). Alignment is a hard problem, and alignment tools generally don’t work very well, but there’s demand for them, so I plan to brush up Align Assist when I hit a good spot with Felix.

I also want to have a better tool for translating XML files, and maybe some other formats like .NET resource files (for localization).

As always, if you have specific preferences for development, please let me know.

Posted by Ryan Ginstrom | in Felix | 5 Comments »

Listing on Microsoft Office Marketplace

May. 23rd 2008

I’ve just got Felix listed on Microsoft Office Marketplace. If you’re a Felix user, I’d really appreciate it if you’d go over there and leave a rating.

It was actually fairly easy to get listed, and best of all, it was free! I simply created a landing page according to Microsoft specifications, and filled out their application form. One of the Office Marketplace people (Rashida Smith) contacted me right away, and I was approved and listed within a week. Thanks Rashida, you were very helpful.

I got the idea for applying from Bob Walsh’s Micro ISV – From Vision to Reality. I plan to post a complete review of this book later, but it contains a lot of great information for people starting a small software business.

Posted by Ryan Ginstrom | in marketing | No Comments »

Defining a vision

May. 22nd 2008

Software needs to have a vision. A software program can’t be everything to everyone; you’ve got to decide who your users are and what you want to do for them.

I had a very clear vision for Felix (then TransAssist, then “Translation Assistant”): it would be powerful, simple, and get out of your way when you weren’t using it. It wouldn’t force its users to adapt to it; it would adapt to its users.

After all, that’s why I created Felix in the first place. I had been working with a couple of the major translation memory systems out there, and I was appalled at how hard to use and buggy they were, but most of all at how arrogant they were. They treated the translator like some trained monkey who had to jump through their hoops, rather than a professional knowledge worker who had just shelled out $1,000+ to use their crappy program.

Going Astray

Somewhere along the way, however, that vision started to get clouded. It was my fault for not stepping up and handling the marketing of Felix myself. Instead I left that to a company that understood the art of selling, but not of making software. In sales meetings where I wasn’t present, they were promising this feature or that feature; then they’d come back to me and say we had to hold up the next release yet again in order to implement feature X that Big Company Y said they absolutely required. Not quite coincidentally, that feature was usually on some checklist put out by one of the big players in the market.

Fire and Motion

Joel Spolsky describes this kind of feature war very aptly as Fire and Motion. Your competitors put out a slew of features that are useless for 99% of users, much like how an octopus sprays out an ink cloud. By the time you catch up, the octopus is long gone, far ahead of you.

Implicit in my desire to finally take over the marketing of Felix was a determination to get back to my vision: simple, easy-to-use software. When you talk to users of translation memory, a lot of them will tell you that they bought their translation-memory program in order to get more work, or because their clients told them to. The classic captive user base. A captive user base is why most enterprise software sucks.

Select and Concentrate

I don’t want Felix to suck. I don’t want people to use Felix because somebody makes them. I want people to use Felix because it helps them work better, faster, and smarter. Sure, not implementing the feature smorgasbord will lose me some users, for whom feature X is a deal maker or deal breaker. On the other hand, keeping a well-defined vision should make the software better for my target users.

That’s not to say that I’m not going to add features. In fact I have a big list of features that I’m working on right now. However, my focus is on features that reduce complexity from the user’s point of view, ideally making the program more powerful in the bargain.

Japanese has a handy expression, 選択と集中 – select and concentrate. The idea is that you pick your spots, and focus your efforts there. Rather than trying to be everything to everyone, you become indispensable to a niche that you select. That’s the direction I plan to take Felix.

Posted by Ryan Ginstrom | in Felix | 1 Comment »

Charge for 100% matches

May. 22nd 2008

Translation memory can certainly improve productivity. Many purchasers of translation are of course aware of this, and want some of these productivity benefits passed on to them. This usually takes the form of offering discounts for sentences/segments already in the translation memory, and sometimes for fuzzy matches.

But this can be taken too far. Some clients will ask not to pay for 100% matches or repetitions at all.

On the face of it, it sounds reasonable — it’s already in the translation memory, so why should they pay for it twice? But there are two problems with this. First, it assumes that inserting the 100% matches/repetitions into the translation involves no work by the translator. This isn’t true; even 100% matches need to be checked for context in each situation.

To give a very simple example, Japanese doesn’t have a capital/lower case distinction. So if the same Japanese sentence is used in a title and the body of a section, you need two different English translations for it. One will be in Title Caps and follow English conventions for titles (e.g. leaving out articles), and one will be in sentence caps and follow normal English grammar conventions.

Also, Japanese very frequently elides the subject and/or object of the sentence, and verbs lack conjugation for person and number. On this basis alone, the same sentence can have many different translations depending on the context. He/she/it/they/we [will] put/puts it/them/him/her/us/the widgets on the list/in the box/over there…

The other problem stems from this dependence on context. Because context is so important to a translation, especially between languages like Japanese and English that are very different syntactically, you’ve got to pay a lot of attention to those 100% matches just to keep up with the context. Furthermore, if you’re using a translation memory created by someone else you have to pay even more attention in order to conform your translation style to the memory; otherwise, you’re liable to end up with some unreadable Frankensteinian hodgepodge of different styles and terminology.

It also follows from the above that simply leaving out the 100% matches, and sending you only the “new” parts, is even worse. You’re then left with a disjointed list of sentences and no idea of how the sentences fit together.

So sure, offer a discount for 100% matches. But think very carefully before offering to insert those 10,000 words of perfect matches for no charge.

Posted by Ryan Ginstrom | in translation memory | 1 Comment »

Translation memory with non-repetitive texts

May. 20th 2008

Often people who translate texts that aren’t very repetitive will wonder if they can really benefit from using translation memory. Of course, since I market my own translation memory program, and use it myself whenever possible, I’m just a tad biased. Even so, I don’t assume that translation memory is the right match for everybody. In this post, I want to explore who can benefit from using it.

At a minimum, your text has to be in electronic format. If the text to be translated is in paper format or a scanned image, it still might be worthwhile to convert it to electronic format (e.g. using OCR). Even if you don’t use translation memory, it will make future searching easier.

Naturally, the more repetitive your text is, the more useful translation memory will be. I’ve seen many cases where a single job would give enough of a productivity boost to more than pay for a Felix license.

But even if the text isn’t very repetitive, there are other benefits of translation memory, assuming your text is in electronic format:

  • Concordance searches
  • Avoid missing entire phrases/sentences in your translation
  • Automatic glossary lookup and management
  • Easier review

Let me go into each of these benefits in detail.

Concordance searches

A concordance search is used to find words or phrases in your translation memory (and their corresponding translation/source). This is useful to find out how you translated a certain term in the past. For example, say you’re dealing with a tricky phrase, and you’re pretty sure you’ve translated it before. You could use a concordance search to find all the places in your translation memory where you’ve translated that phrase in the past. You could then use one of your prior translations, or use it to brainstorm a new one.

Incidentally, Felix allows concordance searches for both source and translation, but some other tools apparently only allow them for the source. To get concordance for a translation, select the text in the Felix memory window, and press Ctrl + Alt + C (Alt + C for source concordance).
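For readers who like to see the idea in code, here is a minimal Python sketch of a concordance search over a translation memory. It assumes the memory is just a list of (source, translation) pairs and uses a plain substring match; the function name and the sample entries are made up for illustration, and Felix's real search is not shown here.

```python
# Hypothetical sketch of a concordance search; not Felix's implementation.

def concordance(memory, query, in_translation=False):
    """Return every (source, translation) pair whose chosen side contains
    the query. Matching is a case-insensitive substring test."""
    needle = query.casefold()
    side = 1 if in_translation else 0
    return [pair for pair in memory if needle in pair[side].casefold()]


# Made-up sample memory.
memory = [
    ("電源を入れる", "Turn on the power"),
    ("電源を切る", "Turn off the power"),
    ("箱に入れる", "Put it in the box"),
]

# Source concordance: how have I translated 電源 before?
for source, translation in concordance(memory, "電源"):
    print(source, "->", translation)

# Translation concordance: which sources did I render with "box"?
for source, translation in concordance(memory, "box", in_translation=True):
    print(source, "->", translation)
```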

Avoid missing entire phrases/sentences in your translation

Dropped phrases, and even entire sentences and paragraphs, are the bane of the translator. Japanese has a rather charming term — 訳漏れ, or “translation leaks” — to refer to this pernicious problem. The problem with translation leaks is that our eyes tend to jump over them when we review our translation. A careful review will catch them, but it would be nice to avoid them in the first place.

Since translation memory is generally used by translating each segment (e.g. sentence) in turn (Felix does this by overwriting the source file), it’s much less likely that you’ll miss out entire sentences or paragraphs. Of course, the problem of missing phrases is still there, especially with very long sentences (or when translating several sentences as a single unit). One trick I use to avoid missing phrases is the “register glossary entries” feature. When I register parts of the source and translation as glossary entries, I can pretty quickly spot when there are missing bits. As an added bonus, I build up my glossary at the same time.

Automatic glossary lookup and management

Here’s an area where you can benefit even if your text doesn’t contain a lot of repetition. By importing your glossaries into your translation memory tool, and creating your own glossaries, you can automatically look up the glossary matches every time you translate a sentence. This is especially useful when your client gives you a massive terminology list that they want you to follow.

Here’s an example of where this feature can help out. I was doing a translation that included a lot of Chinese place names. I’m pretty bad at reading all but the most common of these names, but I found a page on the Internet with the Japanese and English names of all of China’s provinces and many of its cities. I used the handy Internet Explorer feature to dump this data into MS Excel, and added that glossary to Felix from Excel. Then when I translated the document, every place name was looked up for me automatically.
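The place-name lookup described above can be sketched in a few lines of Python. This is an illustration only: it assumes the glossary was saved as tab-separated "term, translation" lines, and the function names and sample entries are hypothetical rather than anything from Felix.

```python
# Hypothetical sketch of automatic glossary lookup; not Felix's implementation.

def load_glossary(lines):
    """Build a term -> translation dict from tab-separated lines."""
    glossary = {}
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) >= 2 and parts[0].strip():
            glossary[parts[0].strip()] = parts[1].strip()
    return glossary


def glossary_matches(segment, glossary):
    """Return (term, translation) pairs for every glossary term found in
    the segment, longest terms first."""
    hits = [(term, trans) for term, trans in glossary.items() if term in segment]
    return sorted(hits, key=lambda hit: len(hit[0]), reverse=True)


# Sample rows, as they might look after exporting a spreadsheet.
rows = [
    "四川省\tSichuan Province\n",
    "成都\tChengdu\n",
    "広東省\tGuangdong Province\n",
]
glossary = load_glossary(rows)

for term, translation in glossary_matches("四川省の成都に工場がある。", glossary):
    print(term, "->", translation)
```

Plain substring matching suits Japanese and Chinese source text, where words aren't separated by spaces. For English source text you would probably want to match on word boundaries instead.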

Easier review

With a review mode, it’s much easier to check each translation against its source segment. Felix also performs a glossary lookup, so you can make sure you’re using glossary terms correctly/consistently.

Conclusion

As I’ve described above, there are several benefits of translation memory even if the text to translate isn’t very repetitive. It remains to be seen, however, whether these benefits are worth the cost of a commercial system. That’s something that every individual translator will have to answer for himself or herself. Even if your work is mostly non-repetitive, however, I recommend trying out translation memory and seeing if it works for you. Most of the commercial translation memory systems have trial versions, and there are free programs available as well.

Posted by Ryan Ginstrom | in translation memory | No Comments »

Online wordcount tool now supports PDF files

May. 20th 2008

I’ve added support for PDF files to my online wordcount tool. Now the tool can provide word counts for PDF files using the pyPdf library.

Addendum: I’ve replaced pyPdf with another pure-python library, pdfminer, which is much more robust at handling PDF files.

Posted by Ryan Ginstrom | in website | 2 Comments »

Choosing your competition

May. 17th 2008

Eric Sink famously quoted a former Netscape CEO, writing that you should choose competition that’s big and dumb. It seems that I’ve chosen wisely. Hubris and lack of customer focus are great things to have in a competitor.

Posted by Ryan Ginstrom | in marketing | No Comments »

New online tool: Word count utility

May. 16th 2008

Today I launched a new tool on the Felix website: an online word count utility. One of my goals for this site is to make it useful to translators, and word counts are one of the pain points for translators. This is especially true of those of us working into or out of (East) Asian languages, since most of the word-count tools out there don’t give Asian character counts (unlike those given by MS Word).

Basically, instead of luring people to this site, I want to make translators — my target customers — want to come here, and keep coming back, and tell their friends to come here, too.
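To show what an Asian character count involves, here is a minimal Python sketch that reports words and Asian characters separately. It is not the code behind the online tool: the Unicode ranges, the rule for what counts as a word, and the function name are all assumptions made for illustration.

```python
# Hypothetical sketch of a word/character counter; not the online tool's code.
import re

# Assumed ranges: kana, CJK unified ideographs, Hangul syllables.
ASIAN = re.compile("[\u3040-\u30ff\u4e00-\u9fff\uac00-\ud7a3]")
# A "word" here is a run of ASCII letters or digits, optionally joined by
# apostrophes or hyphens (so "It's" and "easy-to-use" each count once).
WORD = re.compile(r"[A-Za-z0-9]+(?:['’-][A-Za-z0-9]+)*")


def count_text(text):
    """Count Asian characters one by one and other text by words."""
    asian_chars = len(ASIAN.findall(text))
    # Blank out the Asian characters so they can't glue two words together.
    remainder = ASIAN.sub(" ", text)
    words = len(WORD.findall(remainder))
    return {"words": words, "asian_chars": asian_chars}


print(count_text("Felix は翻訳メモリです。It's easy-to-use."))
```

Punctuation such as 。 is not counted by this sketch. Whether it should be is exactly the kind of detail that differs between word-count tools.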

Posted by Ryan Ginstrom | in website | No Comments »