Machine Translation

Open International University of Human Development “Ukraine”

Faculty of Philology and Mass Communication

Term Paper

On Aspective Translation

“Machine Translation: Past, Present and Future”

Written by Chizhik Alexey

Group PR-21

Checked by Avdeenko V.P.

Kiev 2005

Contents

1. Preface

2. Machine Translation: The First 40 Years, 1949-1989

3. Machine Translation in 1990s

4. Machine Translation Quality

5. Machine Translation and Internet

6. Machine and Human Translation

7. Concluding remarks

8. Literature used

Preface

It is now time to analyze what has happened in the 50 years since machine translation began, to review the present situation, and to speculate on what the future may bring. Progress in the basic processes of computerized translation has not been as striking as developments in computer technology and software. There is still much scope for improving the linguistic quality of machine translation output, which developments in both rule-based and corpus-based methods will hopefully bring. A greater impact on the future of machine translation will probably come from the expected huge increase in demand for on-line, real-time communication in many languages, where quality may be less important than accessibility and usability.

Machine Translation: The First 40 Years, 1949-1989

About fifty years ago, Warren Weaver, a former director of the Division of Natural Sciences at the Rockefeller Foundation (1932-55), wrote his famous memorandum, which launched research on machine translation, at first primarily in the United States but, before the end of the 1950s, throughout the world.

In those early days, and for many years afterwards, computers were quite different from those we have today. They were very expensive machines housed in large rooms with reinforced flooring and ventilation systems to remove excess heat. They required a large number of maintenance engineers and a dedicated staff of operators and programmers. Most of the work was in fact mathematical, done either directly for military institutions or for university departments of physics and applied mathematics with strong links to the armed forces. It was perhaps natural in these circumstances that much of the earliest work on machine translation was supported, directly or indirectly, by military or intelligence funds and was destined for use by such organizations. Hence the emphasis in the United States on Russian-to-English translation, and in the Soviet Union on English-to-Russian translation.

Although machine translation attracted a great deal of funding in the 1950s and 1960s, particularly when the arms and space races began in earnest after the launch of the first satellite in 1957 and the first space flight by Gagarin in 1961, the results of this period of activity were disappointing. The United States came close to abandoning the research altogether after the publication of the damning ALPAC (Automatic Language Processing Advisory Committee) report (1966), which concluded that the United States had no need of machine translation even if the prospect of reasonable translations were realistic, which then seemed unlikely. The authors of the report had compared the quality of the output produced by the systems of the time unfavourably with the artificially high quality of the first public demonstration of machine translation in 1954, the Russian-English program developed jointly by IBM and Georgetown University. The linguistic problems encountered by machine translation researchers had proved to be much greater than anticipated, and progress had been painfully slow. It should be mentioned that just over five years earlier Joshua Bar-Hillel, one of the first enthusiasts for machine translation, who had since become disillusioned with it, had published a critical review of machine translation research in which he rejected the implicit aim of fully automatic high quality translation (FAHQT). Indeed, he provided a proof of its "non-feasibility". The writers of the ALPAC report agreed with this diagnosis and recommended that research on fully automatic systems should stop and that attention should be directed to lower-level aids for translators.

For some years after ALPAC, research continued with much-reduced funding. By the mid 1970s, some success could be shown: in 1970 the US Air Force began to use the Systran system for Russian-English translations, in 1976 the Canadians began public use of weather reports translated by the Meteo sublanguage machine translation system, and the Commission of the European Communities adopted the English-French version of Systran to help with its heavy translation burden, which was soon followed by the development of systems for other European languages. In the 1980s, machine translation recovered from its post-ALPAC low: activity began again all over the world, most notably in Japan, with new ideas for research (particularly on knowledge-based and interlingua-based systems), new sources of financial support (the European Union, computer companies), and in particular the appearance of the first commercial machine translation systems on the market.

Initially, however, the renewed activity was still focused almost exclusively on automatic translation with human assistance, whether before (pre-editing), during (interactive resolution of problems) or after (post-editing) the translation process itself. The development of computer-based aids or tools for use by human translators was still relatively neglected, despite the explicit requests of translators.

Nearly all research activity in the 1980s was devoted to exploring methods of linguistic analysis and generation based on traditional rule-based transfer and interlingua approaches, with AI-type knowledge bases representing the more innovative tendency. The needs of translators were left to commercial interests: software for terminology management became available, and ALPNET produced a series of translator tools during the 1980s, among them, notably, an early version of a "translation memory" program (a bilingual database).
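The idea of a translation memory as a bilingual database can be illustrated with a minimal sketch. It is not a description of the ALPNET program or of any commercial product: the stored segments, the `lookup` function and the 0.75 similarity threshold are all hypothetical, and the character-based similarity from Python's standard `difflib` module merely stands in for whatever matching a real tool uses.

```python
from difflib import SequenceMatcher

# Hypothetical memory: previously translated source segments and their translations.
MEMORY = {
    "Press the start button.": "Нажмите кнопку пуска.",
    "Close the cover before use.": "Перед использованием закройте крышку.",
}

def lookup(segment, memory=MEMORY, threshold=0.75):
    """Return (stored_source, stored_translation, score) for the closest
    stored segment, or None if nothing is similar enough."""
    best = None
    for source, target in memory.items():
        # Character-level similarity between 0.0 and 1.0 (assumed measure).
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if best is None or score > best[2]:
            best = (source, target, score)
    if best is not None and best[2] >= threshold:
        return best
    return None

if __name__ == "__main__":
    # A near-repetition of a stored segment is offered to the translator for revision.
    print(lookup("Press the stop button."))
```

Here a new sentence that closely resembles a stored one brings back the earlier translation as a suggestion, which the human translator then corrects; a sentence with no close match returns nothing and has to be translated from scratch.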

Machine Translation in 1990s

The real emergence of translator aids came in the early 1990s with the "translator workstation". Programs such as "Trados Translator Workbench", "IBM Translation Manager 2", "STAR Transit" and "Eurolang Optimizer" combined sophisticated text processing and publishing software, terminology management and translation memories.

In the early 1990s, research on machine translation was reinvigorated by the arrival of corpus-based methods, especially by the introduction of statistical methods ("IBM Candide") and of example-based translation. Statistical (stochastic) techniques have brought a release from the increasingly evident limitations and inadequacies of the previous exclusively rule-based (often syntax-oriented) approaches. Problems of disambiguation, avoidance of repetition and more idiomatic generation have become more tractable with corpus-based techniques. On their own, statistical methods are no more the answer than rule-based methods were, but there are now prospects of improved output quality which did not seem attainable 15 years ago. As many observers have indicated, the most promising approaches will probably integrate rule-based and corpus-based methods. Even outside research environments integration is already evident: many commercial machine translation systems now incorporate translation memories, and many translation memory systems are being enriched by machine translation methods.

The main feature of the 1990s has been the rapid increase in the use of machine translation and translation tools. The globalization of commerce and information is placing increasing demands upon the provision of translations. This means continuing (perhaps even accelerating) growth in the use, by multinational companies and translation services, of systems that assist in producing good-quality documentation in many languages: machine translation and translation memory systems, multilingual document authoring systems, or combinations of both. Until recently, the production of translations was seen essentially as a self-contained activity. For large users, the appearance of translation systems has stimulated the integration of translation with documentation (technical writing and publishing) processes. Translation is now seen as one stage in the processes of communication and information gathering. Future products of this kind will not be separate, independent machine translation systems, translator workstations or translation tools, but multilingual documentation software suites combining document creation, translation and revision, document archiving, information analysis, retrieval and extraction, etc., in order to satisfy the specific needs of companies.

Machine Translation Quality

Despite the prospects for the future, it has to be said that the new approaches of the present have not yet resulted in notable improvements in the quality of the raw output of translation systems. These improvements may come in the future, but at present the actual translations produced do not represent major advances on those made by the machine translation systems of the 1970s. We still see the same errors: wrong pronouns, wrong prepositions, anomalous syntax, incorrect choice of terms, plurals instead of singulars, wrong tenses, etc., errors that no human translator would ever commit. Unfortunately, this situation probably will not change in the near future. There is little sign that basic general-purpose machine translation programs are soon going to show significant advances in translation quality. I think that if producers of machine translation systems continue to saturate the market with low-quality software, as they do at present, the whole machine translation industry may be condemned forever by the general public as a producer of essentially poor-quality software, which could damage research and development or even bring it to a halt.

To substantiate this, I would like to present examples of translations produced by the machine translation programs most widely distributed in Ukraine: "Promt" and "Magic Gooddy" (same producer), "Pragma", "Socrat", and one web resource that provides on-line real-time translation. Their output is shown for the following extract from a British newspaper article:

The Sunday Times:

Egypt has been training British MI5 and MI6 agents in how to combat Islamic terrorists, underlining Cairo’s growing importance to the war against terror and the Middle East peace process.

A senior Middle Eastern military intelligence official revealed last week that British officers had undergone the training as part of a co-operation programme with Egypt that began after the September 11 attacks on America in 2001 and continued until last year.

Details have not been revealed, but it is believed to have included instruction in specialised interrogation techniques and in the terminology used by terrorists, which will enable agents to understand monitored telephone conversations.

Promt XT (Magic Gooddy):

Египет обучил британский MI5 и MI6 агентов при том, как сразиться с Исламскими террористами, подчеркивая важность роста Каира к войне против ужаса{террора} и ближневосточного мирного процесса.

Старшее Ближневосточное военное должностное лицо сведений{интеллекта} показало на прошлой неделе, что британские чиновники{офицеры} подверглись обучению как часть программы сотрудничества с Египтом, который начал после 11 сентября нападения на Америку в 2001 и продолжался до прошлого года.

Детали не были показаны, но это, как полагают, включило инструкцию в специализированные методы допроса и в терминологию, используемую террористами, которые позволят агентам понять проверенные телефонные беседы.

Socrat:

Египет готовил British MI5 и агентов MI6 в как, чтобы бороться террористов Islamic, подчеркивающих Каир растущего значения в войну против террора и мирный процесс Среднего Востока.

Старший Средний Восточный военный чиновник разведки обнаруживал прошлую неделю, которую Британские должностные лица подверглись подготовку как часть программы сотрудничества с Египтом, который начинался после Сентябрьские 11 атак в Америке в 2001 и продолженной до прошлого года.

Детали не обнаружены, но поверено включенная инструкция в specialised методах опроса и в терминологии использованной террористами, которые позволят, чтобы понимать проверенную телефонную беседу.

Pragma:

Египет обучает Британский MI5 и агенты MI6 в том, как бороться с Мусульманскими террористами, подчеркивая Cаiro’s, дорастающий важность война против террора и процесса мира Ближнего Востока.

Ответственный Средний Восточный военный работник сведений, показанный на последней неделе, что Британские офицеры имели undergone обучение, так как часть программы сотрудничества с Египтом, который начался после 11 сентября, нападает на Америке в 2001 и длительный пока прошлый год.

Детали не были показаны, но это, как полагается, включало инструкцию в speciаlised методах допроса и в терминологии, используемой террористами, которые предоставят возможность агентам понять контролируемые телефонные беседы.

Internet machine translator www.tranexp.com:

Египетский вспомогательный глагол для образования сложных времен быть воспитывать Британский MI5 и MI6 действующая сила в как к бой Мусульманский террорист, подчеркивать Пирамида из камней рост значительность к грамматический определенный член война от ужас и грамматический определенный член Середина Восток мир процесс. ВЫСШАЯ ОТМЕТКА ЗА КЛАССНУЮ РАБОТУ старший Середина Восточный военный ум служебный открывать последний неделя тот Британский чиновник вспомогательный глагол для образования сложных времен испытывать грамматический определенный член воспитывать как часть яние) от высшая отметка за классную работу co - действие программа с Египетский тот начинать за грамматический определенный член Сентябрь 11 атаковать на Американский в 2001 и непрерывный до прошлый год. Подробность вспомогательный глагол для образования сложных времен не быть открывать, только он быть верить к вспомогательный глагол для образования сложных времен заключать обучение в специализация вопрос техника и в грамматический определенный член терминология употребление у террорист, который воля давать возможность или право действующая сила к понимать наставник телефон разговор.