Posts Tagged ‘Python’

Using the myGengo Translation API with Python

Tuesday, May 31st, 2011

For those who haven’t heard the news, Google has deprecated a slew of their APIs, leaving many developers and services in a bit of a pinch. While there’s admittedly still time for developers to transition, it’s a good time to start considering alternatives. In my opinion, it’s probably a good idea to choose a provider that has the technology in question as a core competency; otherwise, you’re much more liable to have that provider pull the rug out from underneath you.

With that said, many engineers are hit particularly hard by the deprecation of the Translation API that Google has so generously offered up to this point, and desire a solid alternative. While there are other machine translation APIs out there, I wanted to take a moment to show more developers how integrating with the myGengo API can get them the best of both worlds.

A Polite Heads Up
As of May 31, 2011, I am working with myGengo to further develop their translation services. However, this post represents my own thoughts and opinions, and in no way represents myGengo as a company. myGengo offers both free machine translation and paid human translation under one API, and is currently offering $25 in free API credits to all new developers interested in trying it out. I simply want to show other developers how easy the API is to use.

Getting Started with the myGengo API

Before you can start getting things translated, you’ll need a myGengo account, a set of API keys, and the mygengo-python library installed. This takes all of 5 minutes to do. A full rundown is available, which includes details on using the API Sandbox for extensive testing. For the code below, we’re going to work on the normal account.

A Basic Example

The myGengo API is pretty simple to use, but the authentication and signing can be annoying to do at first (like many other APIs). To ease this, there are a few client libraries you can use – the one I advocate using (and to be fair, I also wrote it) is the mygengo-python library, which you just installed. With it, making calls and submitting things for translation becomes incredibly easy.
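Here is a minimal sketch of a first call, checking the account balance. It assumes the library exposes a `MyGengo` class that accepts `public_key`, `private_key`, and `sandbox` arguments, plus a `getAccountBalance()` method whose result looks like `{'response': {'credits': ...}}`. Those names come from my reading of the library and may differ in your version, and the key strings are placeholders.

```python
def make_client(public_key, private_key, sandbox=False):
    """Build a client. Assumes mygengo-python exposes MyGengo with these arguments."""
    # Imported lazily so account_credits() below can be exercised without the library.
    from mygengo import MyGengo
    return MyGengo(public_key=public_key, private_key=private_key, sandbox=sandbox)


def account_credits(client):
    """Return the remaining credits from the (assumed) balance response shape."""
    balance = client.getAccountBalance()
    return balance['response']['credits']


# Usage (needs real keys and the library installed):
#   gengo = make_client('your_public_key', 'your_private_key')
#   print(account_credits(gengo))
```

Keeping the client construction separate from the balance lookup makes it easy to swap in the sandbox or a stub while testing.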

The above script should print out 25.00 if you signed up with myGengo using that link above.

Actually Translating Text

Extending the above bit of code to actually translate some text is very simple – the thing to realize up front is that myGengo works on a system of tiers, with said tiers being machine, standard, pro, and ultra. These dictate the type of translation you’ll get back. Machine translations are the fastest and free, but least accurate; the latter three are all tiers of human translation, and their rates vary accordingly (see the website for current rates).

For the example below, we’re going to just use machine translation, since it’s an effective 1:1 replacement for Google’s APIs. A great feature of the myGengo API is that you can upgrade to a human translation whenever you want; while you’re waiting for a human to translate your job, myGengo still returns the machine translation for any possible intermediary needs.

It’s your responsibility to determine what level you need – if you’re translating something to be published in another country, for instance, human translation will inevitably work better since a native translator understands the cultural aspects that a machine won’t.
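As a rough sketch, a machine-tier request could look like the following. The job field names (`type`, `slug`, `body_src`, `lc_src`, `lc_tgt`, `tier`), the `postTranslationJob(job=...)` call, and the location of `body_tgt` in the response are assumptions based on my reading of the library, so verify them against the current API documentation. The helper names `build_job` and `translate` are purely illustrative.

```python
TIERS = ('machine', 'standard', 'pro', 'ultra')  # the tiers described in this post


def build_job(text, source, target, tier='machine', slug='API job'):
    """Build a job payload. Field names are assumptions about the myGengo job format."""
    if tier not in TIERS:
        raise ValueError('unknown tier: %r' % (tier,))
    return {
        'type': 'text',      # kind of content being translated
        'slug': slug,        # short label for the job
        'body_src': text,    # the text to translate
        'lc_src': source,    # source language code, e.g. 'en'
        'lc_tgt': target,    # target language code, e.g. 'ja'
        'tier': tier,
    }


def translate(client, text, source='en', target='ja'):
    """Submit a machine-tier job and return the translated text."""
    result = client.postTranslationJob(job=build_job(text, source, target))
    return result['response']['job']['body_tgt']
```

With a real client, `translate(gengo, 'Hello there!')` would submit English text for machine translation into Japanese.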

This really couldn’t be more straightforward. We’ve just requested that our text be translated from English to Japanese by a machine, and gotten our results instantly. This is only the tip of the iceberg, too – if you have multiple things you need translated, you can actually bundle them all up and post them all at once (see this example in the mygengo-python repository).

Taking it One Step Further!

Remember the “human translation is more accurate” point I noted above? Well, it hasn’t changed in the last paragraph or two, so let’s see how we could integrate this into a web application. The problem with human translation has historically been the human factor itself; it’s slower because it has to pass through a person or two. myGengo has gone a long way in alleviating this pain point, and their API is no exception: you can register a callback URL, and the job is POSTed to it once a human translator has completed it.

This adds another field or two to the translation API call above, but it’s overall nothing too new:
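A sketch of such a request follows, under the same assumptions about field names as before. Here `callback_url` and `comment` are also assumed names for the extra fields, and `build_human_job` and `submit_human_job` are hypothetical helpers. The callback address is a placeholder for a URL your application actually serves.

```python
HUMAN_TIERS = ('standard', 'pro', 'ultra')  # the human tiers described in this post


def build_human_job(text, source, target, callback_url, comment=None, tier='standard'):
    """Build a human-translation job payload (field names are assumptions)."""
    if tier not in HUMAN_TIERS:
        raise ValueError('not a human tier: %r' % (tier,))
    job = {
        'type': 'text',
        'slug': 'API job',
        'body_src': text,
        'lc_src': source,
        'lc_tgt': target,
        'tier': tier,
        'callback_url': callback_url,  # where the finished job should be POSTed
    }
    if comment:
        job['comment'] = comment       # optional context for the translator
    return job


def submit_human_job(client, job):
    """Submit the job; return the interim machine translation if one is included."""
    response = client.postTranslationJob(job=job)
    return response['response']['job'].get('body_tgt')


# Usage (placeholder URL):
#   job = build_human_job('Hello there!', 'en', 'ja',
#                         'http://example.com/mygengo/callback',
#                         comment='Casual greeting on a home page.')
#   interim = submit_human_job(gengo, job)
```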

All we’ve done here is change the level we want, to use a human (standard level), and supplied a callback URL to post the job to once it’s completed. As you can see, the response from our submission includes a free machine translation to use in the interim, so you’re not left completely high and dry. You can also specify a comment for the translator (e.g., if there’s some context that should be taken into account).

Now we need a view to handle the job being sent back to us when it’s completed. Since this is a Python-focused article, we’ll use Django as our framework of choice below, but this should be fairly portable to any framework in general. I leave the routing up to the reader, as it’s largely basic Django knowledge anyway.
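A minimal sketch of such a view is below. It assumes myGengo delivers the completed job as a JSON string in a POST field named `job`. That field name, and the `body_tgt` key mentioned in the comment, are assumptions to check against the callback documentation. The parsing lives in its own function so it can be reused outside Django.

```python
import json


def parse_completed_job(post_fields):
    """Decode the job from POST fields; assumes a JSON string under the key 'job'."""
    raw = post_fields.get('job')
    if raw is None:
        raise ValueError('callback had no job field')
    job = json.loads(raw)  # raises ValueError on malformed JSON
    if not isinstance(job, dict):
        raise ValueError('job payload was not a JSON object')
    return job


def mygengo_callback(request):
    """Django view sketch. Wrap it with csrf_exempt when routing, because the
    POST arrives from an outside service that cannot supply a CSRF token."""
    # Imported lazily so parse_completed_job() is usable without Django installed.
    from django.http import HttpResponse, HttpResponseBadRequest

    if request.method != 'POST':
        return HttpResponseBadRequest('POST required')
    try:
        job = parse_completed_job(request.POST)
    except ValueError:
        return HttpResponseBadRequest('malformed job')
    # Store job.get('body_tgt') wherever your application keeps translations.
    return HttpResponse('OK')
```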

Now, wasn’t that easy? Human translations with myGengo are pretty fast, and you get the machine translation for free – it makes for a very bulletproof approach if you decide to use it.

Room for Improvement?

mygengo-python is open source and fork-able over on GitHub. I’m the chief maintainer, and love seeing pull requests and ideas for new features. If you think something could be made better (or is lacking completely), don’t hesitate to get in touch!

Emulating Ruby’s “method_missing” in Python

Tuesday, November 2nd, 2010

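I don’t pretend to be a huge fan of Ruby. That said, I can respect when a language has a feature that’s pretty damn neat and useful. For the uninformed, method_missing in Ruby is a hook that gets invoked whenever an object receives a call to a method it doesn’t define, which lets a class handle such calls dynamically instead of raising an error.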

Obviously, this is a trick that should be used with caution. It can make for some unmaintainable code, as a class with many methods could get difficult to trace through and figure out just what the hell is happening. It can be put to good use, though – take an API wrapper, for instance. What’s it consist of? Generally, nothing more than the same function calls made over and over to various service endpoints.

Cool, let’s use this in Python!

I recently rewrote Twython to support OAuth authentication with Twitter (as of Twython 1.3). It ships with an example Django application to get people up and running quickly, and the adoption has been pretty awesome so far.

The funny thing about the Twython 1.3.0 release is that it was largely a rewrite of the entire library. It had become somewhat unwieldy, at some two thousand odd lines of code, with each API endpoint getting its own method definition. The only differing aspect of these calls is the endpoint URL itself. This is a perfect case for a method_missing setup – let’s catch the calls to non-existent methods, and look them up in a dictionary that maps method names to endpoints.
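The general shape of the trick can be sketched with Python’s `__getattr__`, which is only called when normal attribute lookup fails. This is not Twython’s actual source: the endpoint table, the `ApiWrapper` class, and the injected `fetch` callable are all hypothetical, there only to show the pattern.

```python
from urllib.parse import urlencode

# Hypothetical endpoint table; names and paths are illustrative only.
API_TABLE = {
    'getPublicTimeline': {'url': '/statuses/public_timeline.json', 'method': 'GET'},
    'updateStatus': {'url': '/statuses/update.json', 'method': 'POST'},
}


class ApiWrapper:
    def __init__(self, base_url, fetch):
        self.base_url = base_url
        self.fetch = fetch  # callable(method, url, params) that performs the request

    def __getattr__(self, name):
        # Reached only for attributes that don't exist, like Ruby's method_missing.
        try:
            endpoint = API_TABLE[name]
        except KeyError:
            raise AttributeError(name)

        def call(**params):
            url = self.base_url + endpoint['url']
            if endpoint['method'] == 'GET' and params:
                # GET parameters are url-encoded into the query string.
                url += '?' + urlencode(sorted(params.items()))
                return self.fetch('GET', url, None)
            # Other methods pass their parameters through as the request body.
            return self.fetch(endpoint['method'], url, params or None)

        return call
```

Unknown names still raise `AttributeError`, so typos fail loudly rather than silently hitting a bogus endpoint.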

The source above is fairly well commented, but feel free to ask in the comments if you need further explanation. This resulted in a much more maintainable version of Twython – for each function that’s listed in a hash table, we can now just take any named parameter and url-encode/combine it. This makes Twython largely agnostic to changes in the Twitter API. Pretty awesome sauce, no?