Language and Informational Prisons: The Case of Arabic

What language you are born into matters. It matters because it’s a means of communication and it matters even more because it’s a kind of soft prison. I regularly turn off the French-language media because I become cumulatively irritated at the number of absurd statements I hear coming out of the mouths of presumably university-educated French newsmen and newswomen. There are fewer absurd affirmations in the news in this English-speaking country simply because good information is more abundant in English than it is in French.

We are used to believing that whoever is intelligent is also well informed. The reverse, we know, is not true. There are plenty of people who accumulate information and who are perfect fools. The best way I have heard it put is from an anonymous author played recently on my local radio station (KSCO Santa Cruz 1080 AM): Being aware of the fact that a tomato is a fruit is to be well-informed; to abstain from putting tomatoes in a fruit salad is to be wise!

The assumption that intelligent people are automatically well informed is so general that when we come across someone who is obviously intelligent but ill-informed we study him like an infinitely interesting creature. I have known several people like that in my life. They drove me crazy. One I know now is smarter than I am, I suspect, but nearly everything he believes to be true is false. My friend has made a philosophical decision not to have any electronic media in his house. He usually carries a book. Over time, I have come to suspect that he does not read very well, that he is dyslexic (whatever that means) or something like that. In general, we don’t think enough about this rare case: the ignorant intelligent person.

What brings forth these musings is an op-ed piece in the Wall Street Journal of 2/17/11 by a Donald J. Kochan (“Reading Adam Smith in Arabic”).

Mr Kochan delivers a tidbit that places the string of revolutions across the Middle East in a different perspective. I knew about that tidbit; I had even mentioned it in a scholarly paper but it had slipped my mind. Here it is:

In the early 1980s, five books were translated (from any language) per million Arabs. For Hungarians, the corresponding figure is 519, or one hundred times more. Let’s bring these numbers to a scale that takes into account the absolute numbers of Hungarian speakers and of Arabic speakers. I do this to consider the informational situation of an Arab high-school graduate (from any country) in comparison to that of a Hungarian speaker. There are about 330,000,000 Arabic speakers (Wikipedia says 360 million, but it is pushing the envelope) and only ten million Hungarian speakers.
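The scaling described in this paragraph can be sketched in a few lines of Python. The per-million rates and the speaker counts are the ones quoted above; the variable and function names are only illustrative, and the sketch assumes that any title translated into a language is, in principle, available to every reader of that language.

```python
# Figures quoted in the paragraph above (early 1980s).
ARABIC_TITLES_PER_MILLION = 5
HUNGARIAN_TITLES_PER_MILLION = 519
ARABIC_SPEAKERS_MILLIONS = 330     # the essay's estimate
HUNGARIAN_SPEAKERS_MILLIONS = 10   # the essay's estimate


def total_titles(titles_per_million, speakers_millions):
    """Total translated titles implied by a per-million rate (assumption: rate x population)."""
    return titles_per_million * speakers_millions


arabic_total = total_titles(ARABIC_TITLES_PER_MILLION, ARABIC_SPEAKERS_MILLIONS)
hungarian_total = total_titles(HUNGARIAN_TITLES_PER_MILLION, HUNGARIAN_SPEAKERS_MILLIONS)

# Ratio of the per-million rates versus ratio of the total title counts.
per_million_ratio = HUNGARIAN_TITLES_PER_MILLION / ARABIC_TITLES_PER_MILLION
total_ratio = hungarian_total / arabic_total

print(f"Arabic titles in total:    {arabic_total}")
print(f"Hungarian titles in total: {hungarian_total}")
print(f"Per-million ratio:         {per_million_ratio:.1f}")
print(f"Total-titles ratio:        {total_ratio:.2f}")
```

The two ratios answer different questions: the first compares translation effort relative to population, the second compares how many distinct translated titles a single reader could in principle lay hands on.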

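With these numbers, there are roughly 1,650 translated titles in all for Arabic (5 × 330) against roughly 5,190 for Hungarian (519 × 10), so each Hungarian high-schooler potentially has access to three times more books of foreign origin than his Arab counterpart. I can hear your comments from here. Here they are: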

We don’t know what either Hungarians or Arabs read in translation. It could be mostly romance novels on one side, either side, and mostly treatises on political economy on the other side. I understand this, but the fact is that the more different titles there are, the more likely it is that some are serious works. And yes, I agree that it would only take one good study to falsify this reasoning. In the meantime, it’s the best that I have, that you have.

Your second objection is probably the cynical one that high-school students don’t read anyway. I am sure that’s true, and college students don’t do much more on the whole. Yet, both kinds of students are exposed to teachers who have read some, while in college in one case, possibly continuously in the other. In this haphazard manner, and across an absurdly large number of years of schooling if you think about it, Americans (and other Westerners and the Japanese, and others) acquire a smattering of history, and a smattering of legal principles. They may even capture the essence of the moral and political foundations of liberty. Most Americans may have never read Thomas Paine but they have heard of him.

It seems to me that the average Arab student is less likely to have been taught by someone so informed. I know I should not refer to the “average Arab student” because so many things vary across countries where Arabic is the native language. They vary, in particular, in connection with the ability to read in foreign languages. In countries that were deeply colonized, such as Algeria, an instrumental knowledge of one foreign language, French, is practically universal among teachers. In countries where the foreign yoke was light or short-lived, like Egypt, it appears that many members of the middle class, including teachers, know some English. It’s not clear that this command of English is good enough to be put to the specific use of reading. In other Arab countries that were never administered from Europe, such as Saudi Arabia, you can be almost sure that workable knowledge of any foreign language is scant among high-school teachers.

At any rate, we must consider the Arab world in general as comparatively ill-informed about the general ideas that are the foundations of our assumptions in developed, democratic, capitalist countries. I had a brief, recent exposure to this reality.

A couple of weeks ago, I came across a strange item in a Moroccan Facebook I follow regularly for my own reasons. The item was a rumor to the effect that someone in America was printing Korans for the sole purpose of burning them. I tried to interject a more reasonable explanation, namely the growing number of American Muslims and of Islamic schools that require thousands of copies of the Koran every year. I began my explanation with the remark that I thought it was my absolute right to burn any religious scriptures including the Koran. I added that I would not do such a thing because it would be rude (“malpoli”). My purpose was to signal to the Moroccan Facebook readers that one could not count on any American civil authorities to stop the burning of Korans if any such event did take place.

I expected a torrent of protestations and of insults, possibly some threats. Instead, my message triggered no reaction at all. A couple of days later it had disappeared from the blogosphere. In the meantime, I speculate that my freedom of religion affirmation, the assertion of my right to disrespect religion, was so outside the range of the Moroccan Facebook users’ understanding of the world that it was literally not understood. Or else, it was treated by all as the statement of an obvious madman. If I am correct, this reaction is especially interesting because the whole conversation was taking place in French and it’s likely that all the participants were young. (I don’t see older Moroccans sitting in Internet cafés communicating with strangers in the un-stylish manner the Internet implies.) Thus, my simple statement was apparently met with incomprehension by what is probably the upper crust of Moroccan society in terms of familiarity with democratic concepts.

So, I suspect that, to a large extent, what we are witnessing in the Middle East is not what we are seeing. Rather, it’s something utterly unfamiliar: a democratic revolution by people, many of them intelligent, who may have only a hazy idea of what democracy entails, besides the obvious fact of elections. I am bracing myself for surprises. Yet, I remain sympathetic to the disciplined, astonishingly civilized revolutions of Tunisia and Egypt.

Incidentally, Kochan’s op-ed describes a program of translation of American and other books into Arabic implemented by the State Department through local embassies. In two words: It languishes. Too bad, that is a form of foreign aid that is peaceful and entirely in keeping with the American tradition of enlightenment. If its funding were multiplied by one thousand, its cost would still be paltry. It might even find favor with the firebrand Libertarians who have embarked on a mission to cancel all foreign aid.

About Jacques Delacroix

I am a sociologist, a short-story writer, and a blogger (Facts Matter and Notes On Liberty) in Santa Cruz, California.
  1. Scott says:

    Jacques. Reread the op-ed. It says 5 books per million Arabic speakers, 519 books per million Hungarian speakers. Your point still holds but it is an important distinction. It won’t be more government that will get those books translated; I shudder to think what the cost was, even if justified in this case. It will be ever-improving translation software. A million mediocre translations, acceptable for this purpose, will soon be had for the cost of 5 official government-sponsored translations, a major improvement. What we should be doing is giving any Arab dissident we like a box full of paid-for satellite phones with Internet access (such as those available in the marine industry), but I doubt anyone in Washington is so clever.

    • jacquesdelacroix says:

      Scott: My arithmetic holds. The number of translated titles available per Arabic reader depends on the total number of books translated, not on the number per anything.
      I wish you would make your point about translation software clearer. I don’t know what you mean. Translation software for obvious languages related to English, such as French and Spanish, is completely awful beyond single words. (If you know something I don’t know, please, tell me.) I have no idea how long it will take before software can translate the Declaration of Independence into Arabic. My guess would be five hundred years or a little less.
      In the meantime, the obstacle is not the difficulty of translation; it’s lack of initiative, lack of interest. There are thousands of Arabic speakers who could translate the Declaration if someone offered them $7 per hour.

    • jacquesdelacroix says:

      PS If anyone is interested in translation quality, here is a simple test: Choose a simple text in English. Have the software of your choice translate it into the language of your choice. Then, have it translate it back into English. Then, read.

  2. Scott cochran says:

    Several points. 1) While your point is still valid, you did omit an important fact: the editorial said “per million”. Yes, far too few books are being translated into Arabic, but probably not to the degree you attest. If there were, say, 100 times as many Arabic speakers as Hungarian (it is probably not that many), then the aggregate level of translations is roughly the same.
    2) Yes, the current translations are poor. But that is an improvement from horrid a couple of years ago and basically non-existent a couple of years before that. 500 years? Really? You know Moore’s law, that processing power for a given cost doubles every 18 months, which has proven very accurate for decades. Thus 500 years is 333 cycles, or a factor of about 2.2046E+100! I refuse to believe that the processing power (and supporting software) needs that level of a quantum jump; a thousand perhaps, not two googol.
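The compounding in that estimate is easy to check. The snippet below is only a sketch: it takes the 18-month doubling period and the 500-year horizon from the comment above and assumes the doubling continues unbroken for the whole period, which is the premise under dispute. The variable names are illustrative.

```python
# Premises taken from the comment: one doubling every 18 months, over 500 years.
YEARS = 500
DOUBLING_PERIOD_YEARS = 1.5

cycles = YEARS / DOUBLING_PERIOD_YEARS      # fractional number of doublings
growth_fractional = 2.0 ** cycles           # growth factor using the fractional count
growth_whole = 2.0 ** int(cycles)           # growth factor using whole doublings only

print(f"Doublings in {YEARS} years: {cycles:.2f}")
print(f"Growth factor (fractional count): {growth_fractional:.4e}")
print(f"Growth factor (whole doublings):  {growth_whole:.4e}")
```

The figure quoted in the comment corresponds to the fractional count of doublings; rounding down to whole doublings gives a somewhat smaller factor of the same order of magnitude, so either way the result is on the order of a googol.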

    Speaking of Google, let’s use them as an example.

    I would argue that voice recognition software a decade ago was far worse than the current level of translation software. About a decade ago a company I used to call on was selling voice recognition software for voice mail systems. It is commonplace now to call many companies, ask the computer operator for Jacques Delacroix, and get the right person. When I called a decade ago it was laughably poor. “Jacques Delacroix” would get you “Ed Jones”, I am not kidding. I suspect its accuracy was less than 10%, and from a very finite data set (the number of employees at the company). Four years ago Google launched 1-800-GOOG-411 (since discontinued, a pity, as it was quite successful), where you ask the information operator (a computer) to connect you with any person or business in the country. In 6 years, 4 Moore’s cycles! With likely 90% accuracy against a massive data set (plus the level of complication to maintain the DB), even over poor-quality cell phone lines! Starbucks in Lahs Gahtis (which is how Los Gatos residents pronounce this town) would get you the right coffee shop. What is next? Google is coming out with an app for Android phones that will allow REAL TIME translations. Speak Arabic into the speaker, the receiver will hear Hungarian (voice recognition plus translations fast enough that there is no delay). Yes, the first version will be bad, but it will improve, to the point that I wonder when the hearer won’t know if he is speaking to a native speaker or not. 500 years, no. 10-15 tops.

    Perhaps we should start the Delacroix Prize? A takeoff on the Loebner Prize:
    The Loebner Prize is an annual competition in artificial intelligence that awards prizes to the chatterbot considered by the judges to be the most human-like. The format of the competition is that of a standard Turing test. In each round, a human judge simultaneously holds textual conversations with a computer program and a human being via computer. Based upon the responses, the judge must decide which is which.
    You will be given 4 recipes from a French cookbook and 1 recipe from an English cookbook that has been translated into French via inexpensive translation software. You need to tell which is the translation (and based on the translation, we need to modify the ingredients to comply with the metric system, pick a French recipe etc.)
    I think that translations “good enough” for that will occur in less than 4 years. (You will be able to tell it is a translation but can accurately cook the dish.) Within 10, you won’t be able to tell. Within 15, even complicated translations like an editorial in the WSJ will be transparent.

    I would argue that excellent translations to virtually any major language will occur in 10-15 years, and “good enough” translations, where the gist of what is being talked about comes through, albeit a bit clumsily or clunkily, in 5. I would also argue that the “good enough” translations are “good enough” for translations of major works into Arabic. (You will disagree as you have a rather high standard for language, but I would argue that relative to what they are getting now, nothing, it would be a huge leap forward.) The Declaration of Independence will have good enough translations in 5 years.

    And the software needs to be cheap if not free. $7 is several days’ wages for millions of Arabic speakers. (A month’s worth if it takes several hours.) Not wanting to pay that much to read an English document could be forgiven; it is not solely a lack of motivation.

    • jacquesdelacroix says:

      Scott: I am still puzzled about your remarks on my arithmetic. I say that there are three times fewer translated books for the average Arabic speaker than there are for the average Hungarian speaker. Is this correct or not?
      I do this after replicating the assertion in the article I quote because that’s the decent thing to do. What’s the beef?
      Now about your predictions: You give a good example of naive enthusiasm inspired rather than moderated by technical knowledge. First, there is no Moore’s “Law.” There is an empirical observation regarding the limited concept of computing power. That seems to be a long way from what we are talking about, translation. What I know for a fact is that translation from one simple language to another, say, French to English, is today normally abominable when you go beyond two words at a time. I don’t mean that such translations do not correspond to my own sense of style. I mean that they are incomprehensible. Perhaps you are right and they used to be even more abominable ten years ago. I have trouble disagreeing with you on this because I don’t have a scale of translation abomination. Which inevitably brings up the question: Do you? (Have a scale of translation abomination.) I know this sounds mean and awful but I often marvel a also about the translation thereof.
      With all this, I don’t know that you are wrong. Perhaps, in some year not yet visible on the temporal horizon, good software will replace translators and even simultaneous interpreters. And as the French say so exquisitely, “If my aunt had two, we would call her ‘Uncle.’” Go ahead, have Google translate this!
      Incidentally, by selecting cooking recipes, you set yourself an easy task, probably unconsciously. The challenge is the first page of the Declaration of Independence.
      Keep us informed and give me the final word on my arithmetic, please.
      PS Under the right circumstances, I would be interested in trying out the best translation software and reporting on this blog.

  3. David says:

    I figured I’d take some time to weigh in on this fun topic.

    I don’t think computerized translation will be happening between all languages anytime soon. The main problem with translation is that words are used to describe concepts, and languages use different linguistic mechanics. And not all similar words have a similar enough concept/mechanics behind them to make translation simple. For example, during the Beijing Olympics, I saw numerous “odd” translations from Chinese to English. (If memory serves me correctly, the translators were human, not computer.) One such sign was translated as “Slip Carefully,” which, I think, was intended to mean “Watch Your Step” or perhaps “Wet Floor.” Unfortunately I only saw the sign, not the surrounding area, so I can’t be sure what the intended meaning was. The point of the sign was to warn people; when translated, however, it just caused confusion because of the difference in concept/mechanics between Chinese and English. That doesn’t even take into account regional vernacular/alternate meanings.

    I take my next example from Spanish to English translation. Two very well understood languages; probably the two most widely used languages in the world. In English, a common salutation is “Have a good/great/nice day!” In Spanish a common salutation is “Que se vaya(n) bien!” They are both trying to describe the concept of a friendly, well-meaning salutation, but direct translation doesn’t work so well. “Que se vaya(n) bien” equates, grammatically, to “Whatever to you it go well.” That’s a linguistic structural difference that would make it difficult to understand the concept of having a good day. How a concept gets structured can make translation difficult.

    Side note, I went to Google Translate to see what it would say the English meaning of “Que se vaya bien” is; Google translated it as “that go well.”

    English and Spanish are fairly well grasped languages. Translating between English and Arabic and back again, or (more entertaining, I would think) between Chinese and Arabic and back again, even WITH HUMANS who have the capacity of comprehension, strikes me as a monumental task. Trying to input the nuances, comprehension and fluidity of language into a computing device isn’t going to happen for a long while. By the time we get today’s language inputted into a computer translator, language will have evolved beyond it. (Take the word “gay,” for example; it keeps having different nuances/meanings as time moves forward.)

    • jacquesdelacroix says:

      Interesting. I suspect Scott was all buffed up because of the recent contest in artificial intelligence where software did surprisingly well against humans. I don’t even want to begin to explain the nature of the difficulties involved in translation with people who are innocent of any foreign language. I don’t want to do it because it would take me straight into the book I am not ready to write yet. I am glad you took a crack at it, though.

      And I am still waiting for the private helicopter in my garage I was promised in 1954!

      • David says:

        I realized the complexity of attempting to describe translation difficulties as I was formulating the concept in my mind. As I was typing it I realized the scope and started editing, with the hope of keeping it brief and clear. Looking back on it I think I did an okay job, considering the brevity in which I wrote it, but to develop the idea properly would require an extremely long essay or, more likely, a book. Like you said, I took a crack at it. 😀

      • jacquesdelacroix says:

        Hi, David. The main problem I have with this topic is a widespread naivety of the American-born about the act of doing anything in a foreign language. They grow up surrounded by people who know their language. They assume from this that it must be fairly easy. When they try, they almost all fail because they were not prepared for the time investment required. For reasons that are mysterious, as far as I am concerned, the experience does not change their view. I suspect there is a whole national cultural worldview behind this particular kind of obstinacy. I suspect it’s part of a broader whole that has many beneficial features. You see how this would quickly become a book.

        As for the statement about failing, here is a test I used to give informally to French and Spanish majors, including those graduating the next month:

        How would you say this perfectly ordinary English sentence in (French, Spanish):

        “If I had known it was going to be like this, I would never have come.”

        I don’t know how often I gave the quiz but I did it for thirty years every time the opportunity arose. Pass rate: 0%

  4. David says:

    Watson was basically one big database. The hard part of Watson was getting the language recognition software written. Even then it made some odd mistakes. And that was just trying to get a computer programmed in English to understand English well enough to be a very accurate search engine.

