Main types of machine translation errors


  • Lecture 13. Machine translation of text. Computer language dictionaries

    Fundamentals of machine translation

    Translation is a type of language mediation oriented toward an original in a foreign language. A translation is regarded as the form in which the message contained in the original exists in another language. Of all forms of interlingual communication, communication through translation most closely reproduces direct verbal communication, in which the participants use the same language.

    Machine translation is the automatic extraction of knowledge from texts written in natural language, carried out by computer programs that rely on linguistic support.

    The machine translation process is the action a computer performs to convert a text in one natural language into a text of equivalent content in another language; the term also denotes the result of that action.

    An automatic text comprehension system starts from the assumption that a natural-language text, built in accordance with the dictionaries, grammar and algorithms of the language and relying on semantic networks, frames and thesauri, is understood by the user because the user has linguistic knowledge (syntactic-semantic structures) as well as specialized knowledge.

    Most automatic language processing systems aim to analyze texts that are pre-divided into sentences. However, language data is available to us most often in the form of texts marked up into paragraphs, chapters, and other larger units. Therefore, appropriate segmentation algorithms are needed for their effective automatic analysis.
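
    As an illustration of the segmentation step mentioned above, here is a minimal sketch in Python. It assumes that paragraphs are separated by blank lines and that a sentence ends at ".", "!" or "?" followed by whitespace and a capital letter; the abbreviation list and the function names are hypothetical, and a real system would need far richer rules.

```python
import re

# Hypothetical abbreviation list; a real system would use a much larger one.
ABBREVIATIONS = {"e.g.", "i.e.", "etc.", "dr.", "mr.", "mrs.", "prof."}


def split_sentences(paragraph: str) -> list[str]:
    """Split one paragraph at . ! or ? followed by whitespace and a capital."""
    sentences = []
    start = 0
    for match in re.finditer(r"[.!?]\s+(?=[A-Z])", paragraph):
        candidate = paragraph[start:match.end()].strip()
        last_token = candidate.split()[-1].lower()
        if last_token in ABBREVIATIONS:
            continue  # the period belongs to an abbreviation, keep scanning
        sentences.append(candidate)
        start = match.end()
    tail = paragraph[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


def segment_text(text: str) -> list[list[str]]:
    """Split text into blank-line-separated paragraphs, then into sentences."""
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    # Collapse line breaks inside a paragraph before splitting it.
    return [split_sentences(" ".join(p.split())) for p in paragraphs]
```

    Calling segment_text on a two-paragraph text returns one list of sentences per paragraph, which can then be passed sentence by sentence to the analysis stages.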

    Tasks in creating an automatic text comprehension system:

    · analysis of the source natural-language text, which builds linguistic structures, including various semantic structures (complete, partial, compressed), and tends to present the content of the text in the form of databases;

    · comparison of the linguistic structures of the text with special or individual knowledge, also presented in the form of a database;

    · generalization of texts based on information contained in traditional relational databases, as well as in conceptual text structures or in individual databases.

    For machine translation, a special program implementing a translation algorithm is loaded into the computer. A translation algorithm is a sequence of unambiguous, strictly defined operations on the text that find translation correspondences in a given pair of languages for a given direction of translation (from one specific language to another). There are also machine translation systems designed to translate among three or more languages, but they are currently experimental.

    Modern machine (automatic) translation is carried out with human assistance: a pre-editor prepares the text to be translated in one way or another, an inter-editor takes part in the translation process itself, and a post-editor corrects errors and shortcomings in the machine-translated text.

    A machine translation system includes bilingual dictionaries supplied with the necessary grammatical information (morphological, syntactic and semantic), which provide equivalent, variant and transformational translation correspondences; explanatory and special thematic dictionaries; and algorithmic tools for grammatical analysis that implement one of the formal grammars adopted for automatic text processing.

    The most common sequence of formal operations providing analysis and synthesis in a machine translation system is the following:

    · Text input and lookup of the input word forms in the input dictionary, with accompanying morphological analysis that establishes which lexeme a given word form belongs to. During this analysis, information related to other levels of organization of the language system can also be obtained from the word form.

    · Translation of idiomatic phrases, phraseological units or set phrases (clichés) of the given subject area; determination of the main grammatical characteristics of the elements of the input text; resolution of homography; lexical analysis and translation of lexemes. Usually, at this stage, words with a single meaning are separated from polysemous ones. Single-meaning words are then translated according to lists of equivalents, while polysemous words are translated using so-called contextological dictionaries, whose entries are algorithms that query the context for the presence or absence of contextual determinants of meaning.

    · Final grammatical analysis, during which the necessary grammatical information is additionally determined, taking into account the data of the target language.

    · Synthesis of output word forms and of whole sentences in the target language.
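
    The lexical step in the list above (lists of equivalents for single-meaning words, contextological entries for polysemous ones) can be sketched roughly as follows. All dictionary entries, determinant words and function names here are invented for illustration, and leaving unknown words untranslated is an assumption of this sketch rather than a rule stated by any particular system.

```python
# Hypothetical list of equivalents for words with a single meaning.
EQUIVALENTS = {"translation": "перевод", "dictionary": "словарь"}

# Hypothetical contextological entry: ordered rules of
# (context determinants, translation), followed by a default translation.
CONTEXT_RULES = {
    "bank": {
        "rules": [
            ({"river", "shore", "water"}, "берег"),
            ({"money", "account", "loan"}, "банк"),
        ],
        "default": "банк",
    }
}


def translate_word(word: str, context: set[str]) -> str:
    """Translate one word, querying the context for determinants if needed."""
    word = word.lower()
    if word in EQUIVALENTS:
        return EQUIVALENTS[word]
    entry = CONTEXT_RULES.get(word)
    if entry is None:
        return word  # assumption: unknown words keep their original spelling
    for determinants, translation in entry["rules"]:
        if determinants & context:  # at least one determinant is present
            return translation
    return entry["default"]


def translate_sentence(sentence: str) -> list[str]:
    """Translate word by word, using the other words of the sentence as context."""
    tokens = [t.strip(".,;:!?").lower() for t in sentence.split()]
    tokens = [t for t in tokens if t]
    result = []
    for i, token in enumerate(tokens):
        context = set(tokens[:i] + tokens[i + 1:])
        result.append(translate_word(token, context))
    return result
```

    Here the "algorithm querying the context" is reduced to a set intersection; real contextological entries may also test word order, grammatical features or distance from the word being translated.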

    Analysis and synthesis can be carried out either sentence by sentence or for the entire text loaded into the computer's memory; in the latter case, the translation algorithm also provides for identifying so-called anaphoric links.

    The quality of machine translation depends on:

    · the size of the dictionaries;

    · the amount of information assigned to lexical items;

    · the thoroughness with which the analysis and synthesis algorithms are compiled and tested;

    · the efficiency of the software.

    However, no program can yet be expected to produce a “correct”, literary translation of a text consisting of complex sentences.

    Modern hardware and software allow the use of large dictionaries containing detailed grammatical information. The information can be presented in both declarative (descriptive) and procedural (taking into account the needs of the algorithm) form.

    The improvement of machine translation programs is associated with the concept of soft text comprehension, according to which different users extract their own information and their own individual meaning from the same text. The model of soft text comprehension consists in the ability to generate various meaningful interpretations of the source object, depending on the conditions and components of its perception.

    More specialized "machine aids" for the translator and editor are automatic dictionaries and terminological databases, computer thesauri, on-screen editing tools, and systems for spelling, terminological and grammatical correction of texts.

    Modern machine translation should be distinguished from the use of computers to assist a human translator. The latter refers to an automatic dictionary, which helps a person quickly select the desired translation equivalent. Although in both cases the computer works together with a person (translator or editor), the term "machine translation" implies that the machine takes on most of the work of translating and of finding translation equivalents and correspondences, leaving the person only to check the result and correct errors.

    A computer dictionary that assists a person is an auxiliary tool for quickly finding translation correspondences; such dictionaries may also implement, to a limited extent, some of the functions of machine translation systems.

    In information technology, two main approaches to machine translation are distinguished:

    · superficial familiarization with the content of a document in an unfamiliar language;

    · the use of machine translation in place of ordinary "human" translation. This involves careful editing and customization of the translation system for a particular subject area.

    What matters here is the completeness of the dictionary, its fit to the content and the linguistic means of the texts being translated, the effectiveness of the methods for resolving lexical polysemy, and the effectiveness of the algorithms for extracting grammatical information, finding translation correspondences and performing synthesis.

    As a type of language activity, translation affects all levels of language, from the recognition of graphemes (and of phonemes in the translation of oral speech) to the transfer of the meaning of an utterance and of a text. In addition, machine translation provides an opportunity to test theoretical hypotheses about the structure of particular language levels and the effectiveness of the proposed algorithms.

    The need to improve machine translation is constantly growing, since translation is the most important means of ensuring interlingual communication, whose volume increases every year.

    Other ways of overcoming language barriers, namely the development or adoption of a common language and the study of foreign languages, cannot compare with translation in terms of efficiency.

    The birth of machine translation as a research area is usually dated to March 1947, when the cryptographer Warren Weaver, in a letter to Norbert Wiener, first posed the problem of machine translation, comparing it with the problem of decryption.

    Machine translation problems are the province of computational linguistics, which was born in January 1954, when the world's first public experiment in machine translation was held by Georgetown University (USA). At the same time, active work on machine translation began in Moscow under the leadership of the prominent mathematician and cyberneticist Alexei Lyapunov. At the beginning of 1956, the first Soviet system for machine translation from French into Russian began operating at the M.V. Keldysh Institute of Applied Mathematics (IPM).

    The leaders among modern machine translation programs in Russia are the PROMT system (developed by PROMT, www.e-promt.ru) and the SOCRAT system (developed by Arsenal, www.ars.ru).

    The latest version of PROMT has a fundamentally new feature, "Associated Memory". This mechanism makes it possible to train the system: a translation that satisfies the user can be saved in the knowledge base, and its fragments can later be reused when translating similar texts.
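
    The general idea of reusing stored translations can be sketched as a simple translation memory. This is not PROMT's actual mechanism, whose internals are not described here; the class, the similarity threshold and the use of Python's difflib for fuzzy matching are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """A toy store of approved translations with fuzzy lookup."""

    def __init__(self, threshold: float = 0.75):
        self.threshold = threshold  # hypothetical minimum similarity
        self.entries: dict[str, str] = {}

    def store(self, source: str, target: str) -> None:
        """Save a translation the user was satisfied with."""
        self.entries[source] = target

    def lookup(self, source: str):
        """Return (stored_source, stored_target, score) for the best match,
        or None if nothing is similar enough."""
        best_score, best = 0.0, None
        for stored_source, stored_target in self.entries.items():
            score = SequenceMatcher(
                None, source.lower(), stored_source.lower()
            ).ratio()
            if score > best_score:
                best_score, best = score, (stored_source, stored_target)
        if best is not None and best_score >= self.threshold:
            return best[0], best[1], best_score
        return None
```

    A near match such as "Press the stop button." against a stored "Press the start button." is returned together with its score, so that a human editor can adapt the stored translation instead of translating from scratch.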

    SOCRAT tries to find a single unambiguous solution and does not offer variant translations of terms; a word that is not in the dictionary is left in its original spelling. PROMT usually offers several options for translating words and phrases.

    Machine translation has gone through several stages of development and is currently focused on the idea of modeling the actions of a human translator. The translation process is very complex, and the quality of the translation is largely determined by how well the advantages of the software are used. Modern machine translation systems include many additional dictionaries. By the architecture of their linguistic algorithms, systems are divided into two types, "Transfer" and "Interlingua", and automatic translation programs are built in accordance with this division. For example, SOCRAT translates much better than, say, "Magic Goody", because its linguistic support is much stronger and its dictionaries are much larger.

    Machine translation results always have to be edited. The "Pars" program, for example, allows additional dictionaries for various subject areas to be connected, since the quality of a machine translation also depends on the quality of the software. But even fine-tuning the system to the vocabulary of the translated text does not capture all of its features, so translated words that have several synonyms are marked with an asterisk or given in brackets as variants.

    Internet technologies gave machine translation a new impetus and brought it to a new stage of development. Machine translation is an effective tool for viewing and searching for information in a foreign language, and this is its main function on the Internet. The current state of machine translation makes it possible to obtain a reasonably accurate translation of web pages from most languages. And although fully automatic high-quality translation is not yet possible, software already exists that facilitates the translation process itself.

    When it is customized to the subject area and integrated with other document processing programs, machine translation makes it possible to automate the production of the translated text.

    The main problem for all machine translation programs is the correct choice of a thematic dictionary and the construction of auxiliary dictionaries.

    The translation depends partly on the user's level of knowledge (command of the language, skill in working with programs, feel for the language) and, to a greater extent, on the user's ability to work correctly with a text editor, auxiliary utilities, dictionaries and phrase books. Translations made with thematic dictionaries connected are of good quality, with the correct choice of word meanings and appropriate use of phrases in the text. This is because the machine adjusts its dictionary to select the synonyms most relevant to the subject of the source text and translates in accordance with the subject matter in the target language.

    There are two approaches to the problem of developing machine translation:

    · an orientation toward the use of a universal language of meaning and a direct approach to translation, that is, transformation of the source text into the translated text;

    · an orientation toward an intermediate language, modeling human command of language.

    The problem is that the meaning of a natural-language text depends not only on the sentence itself but also on the context. This is compounded by the ambiguity of words and syntactic constructions, the practical impossibility of a global description of the semantic structure of the world even within a limited subject area, and the lack of effective formal methods for describing linguistic patterns.

    The unresolved problems of machine translation are:

    · resolving the ambiguity of the formal parsing of isolated sentences of a text;

    · overcoming the structural and semantic incompleteness of sections (fragments) of the text;

    · organizing the flexible connection of different subject areas;

    · the need to understand the text as a single whole.

    Machine translation programs are better at processing scientific, technical and educational texts, which are characterized by a strict presentation of the material.

    Texts in colloquial and journalistic styles, which contain many specific turns of phrase but use most words in their literal sense, are suitable for gist ("introductory") translation, but manual editing is required to obtain a competent output text. The result is a kind of introductory text that conveys only the general subject matter of the original.

    Fiction and poetry are beyond the capabilities of machine translation. The meaning of a text built on figurative expressions is distorted in machine translation and cannot be grasped even for gist purposes. The machine does not understand ambiguity, which leads to misinterpretation, so that the translated text turns into nonsense.

    Such misunderstandings can be minimized by observing the following rules:

    · build thematic dictionaries correctly;

    · check the source text at the stage of preparing it for translation;

    · edit the text at the final stage of translation;

    · use dictionary programs properly;

    · have a good knowledge of grammar and vocabulary, as well as of the subject matter of the source text;

    · handle vocabulary, clichés and word forms properly;

    · add new terms to the special dictionaries in good time.

    How does the translator program work?

    It is based on a translation algorithm: a sequence of unambiguous, strictly defined operations on the text that find correspondences in a given pair of languages L1 - L2 for a given direction of translation (from one specific language to another). Ordinary dictionaries and grammars of different languages are not applicable to machine translation, since they describe the meanings of words and grammatical patterns loosely, which is unacceptable for “machine” use. A formal grammar of the language is therefore needed, that is, one that is logically consistent and explicit, with nothing implied or left unstated. As soon as formal descriptions of various areas of language began to appear, primarily of morphology and syntax, progress was made in the development of automatic translation systems. To work successfully, a machine translation system includes, first, bilingual dictionaries supplied with the necessary information (morphological, relating to word forms; syntactic, describing how words can be combined in a sentence; and semantic, that is, responsible for meaning), and second, tools for grammatical analysis based on one of the formal, that is strict, grammars. The most common sequence of formal operations providing analysis and synthesis in a machine translation system is the following.

    • 1. At the first stage, the text is entered and the input word forms (words in a specific grammatical form, for example the dative plural) are looked up in the input dictionary, with accompanying morphological analysis that establishes which lexeme (a word as a dictionary unit) a given word form belongs to. During this analysis, information related to other levels of organization of the language system can also be obtained from the word form, for example which part of the sentence the word can be. For a machine, combining these two operations, grammatical analysis and reference to the meanings of words, is a difficult task. It is better to make syntactic analysis independent of word meanings and to use the dictionary at other stages of translation.
    • What independent syntactic analysis is can be understood by trying to parse a phrase from which the meanings of specific words have been “removed”. A brilliant example of such a phrase was coined by Academician L.V. Shcherba: "Glokaya kuzdra shteko budlanula bokra i kurdyachit bokryonka." A meaningless phrase? Seemingly so: Russian has none of the words it consists of (except the conjunction i, "and"). And yet, to some extent, we understand it.
    • In other words, the machine performs syntactic analysis of the sentence without relying on the meanings of its constituent words, using only information about their grammatical properties. The result of syntactic analysis is a syntactic structure depicted as a dependency tree: the “root” is the predicate, and the “branches” are its syntactic relations with dependent words. Each word of the sentence is recorded in its dictionary form, together with the grammatical characteristics it has in the analyzed sentence.
    • 2. The next stage includes the translation of idiomatic phrases, phraseological units or set phrases (clichés) of the given subject area (for example, in English-Russian translation, phrases such as in case of and in accordance with receive a single numeric equivalent and are excluded from further grammatical analysis); determination of the main grammatical (morphological, syntactic, semantic and lexical) characteristics of the elements of the input text (for example, the number of nouns, the tense of verbs, their role in the sentence), carried out within the input language; resolution of homography (for example, English round can be a noun, adjective, adverb, verb or preposition); and analysis and translation of words. Usually, at this stage, words with a single meaning are separated from polysemous ones (those having more than one translation equivalent in the target language). Single-meaning words are then translated according to lists of equivalents, while polysemous words are translated using so-called contextological dictionaries, whose entries are algorithms that query the context for the presence or absence of contextual determinants of meaning.
    • 3. The final grammatical analysis, during which the necessary grammatical information is additionally determined, taking into account the data of the output language.
    • 4. Synthesis of output word forms and of whole sentences in the target language. A simple translation of the "nodes" of the tree into another language is not enough here. The syntax of each language is arranged in its own way: what is the subject in a Russian sentence may (or must) be expressed by an object in another language, and the object, conversely, must be turned into the subject; what one language denotes by a group of words is rendered in another by a single word, and so on. This transition from the structure to an actual sentence is called syntactic synthesis.
    • Depending on the morphology, syntax and semantics of a particular language pair, as well as on the direction of translation, the general translation algorithm may include other steps, modifications of these steps or a different order, but in modern systems such variations are, as a rule, insignificant. Context analysis is used to solve the problem of word ambiguity. Each of the several meanings of a polysemous word is, in most cases, realized in its own set of contexts; that is, each of the "competing" meanings has its own set of contexts. It is precisely this dependence of meaning on the environment that allows the listener to understand an utterance correctly. Correct understanding also requires taking full account of three kinds of rules: rules by which the chosen meaning is conditioned by the lexical environment (operating in the “phraseological” interpretation of the word), rules by which it is conditioned by the semantic context (the so-called laws of semantic agreement), and rules by which it is conditioned by the grammatical (morphological and syntactic) context.
    • Existing machine translation systems tend to target specific language pairs (for example, French and Russian or Japanese and English) and to use translation correspondences either at a surface level or at some intermediate level between the input and output languages. The quality of machine translation depends on the size of the dictionary, the amount of information assigned to lexical units, the thoroughness with which the analysis and synthesis algorithms are compiled and tested, and the efficiency of the software. Modern hardware and software allow the use of large dictionaries containing detailed grammatical information. The information can be presented in both declarative (descriptive) and procedural (taking into account the needs of the algorithm) form.
    • In translation practice and in information technology, there are two main approaches to machine translation. On the one hand, the results of machine translation can be used for superficial acquaintance with the content of a document in an unfamiliar language. In this case the translation serves as signal information and does not require careful editing. The other approach uses machine translation in place of ordinary "human" translation. This involves careful editing and customization of the translation system for a particular subject area. What matters here is the completeness of the dictionary, its fit to the content and the linguistic means of the texts being translated, the effectiveness of the methods for resolving lexical polysemy, and the effectiveness of the algorithms for extracting grammatical information, finding translation correspondences and performing synthesis. In practice, translation of this type becomes economically viable if the volume of translated texts is large enough (at least several tens of thousands of pages per year), if the texts are sufficiently homogeneous, if the system dictionaries are complete and allow for further expansion, and if the software is convenient for post-editing.
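
    The dependency tree described in the steps above (the predicate as root, each word stored in its dictionary form with its grammatical characteristics) can be represented, for example, as follows. The class, the relation labels and the feature names are hypothetical, and the subject/object swap is only a toy stand-in for the restructuring that syntactic synthesis may require.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    lemma: str                    # dictionary form of the word
    features: dict[str, str]      # grammatical characteristics in this sentence
    relation: str = "root"        # syntactic relation to the parent node
    children: list["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        self.children.append(child)
        return self  # return the parent so that calls can be chained


def dependents(node: Node, relation: str) -> list[Node]:
    """Return the direct dependents attached by the given relation."""
    return [c for c in node.children if c.relation == relation]


def lemmas(node: Node) -> list[str]:
    """Collect dictionary forms depth-first, starting from the predicate."""
    result = [node.lemma]
    for child in node.children:
        result.extend(lemmas(child))
    return result


def swap_subject_object(node: Node) -> None:
    """Toy structural transfer: exchange the subject and object roles."""
    for child in node.children:
        if child.relation == "subject":
            child.relation = "object"
        elif child.relation == "object":
            child.relation = "subject"


# "The translator edits texts." with the predicate as the root of the tree.
tree = (
    Node("edit", {"pos": "verb", "tense": "present", "number": "singular"})
    .add(Node("translator", {"pos": "noun", "number": "singular"}, "subject"))
    .add(Node("text", {"pos": "noun", "number": "plural"}, "object"))
)
```

    Keeping relations and grammatical features separate from the lemmas is what allows the structure to be analyzed and rearranged without consulting word meanings.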

    Machine translation, or more precisely computer translation, is also a form of written translation, because its result is a written text. However, it is carried out not by a translator but by a special computer program. Modern computer translation programs are quite advanced, but they still cannot solve the hardest task of the translation process: choosing the contextually required variant, which in each text depends on many factors. At present, the result of this type of translation can serve as a draft of the future text, to be edited by a translator, and as a means of getting a general idea of the topic and content of a text when no translator is available.

    An even more difficult task is the translation of spoken text by computer programs, since the problem of speech recognition is only at the initial stage of its solution. So far, the individual coloring of how a stretch of speech sounds remains an insurmountable obstacle: in any language, such speech is hard to formalize.

    Pre-editing of syntactic structure may include:

    · splitting an extra-long sentence (more than 40 words) into several shorter ones, adding linking elements where necessary;

    · inserting articles into English text where necessary or grammatically justified;

    · repeating shared elements when phrases in a sentence are joined by coordination;

    · inserting conjunctions where sentences are linked without them;

    · eliminating bracketed structures in the middle of a noun phrase or in the middle of a sentence;

    · replacing occasional abbreviations with full names, or marking them with special characters that block their translation;

    · eliminating lexical and logical ellipses, informal constructions and metaphors;

    · bringing to a single form any constructions or compound words that occur in the text in solid, hyphenated and open spellings.

    The manually edited text is then processed automatically by the MT system.
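
Two of the pre-editing operations listed above lend themselves to simple tool support. The following is a minimal sketch, assuming a naive sentence splitter and a hand-supplied list of spelling variants; the function names are hypothetical, and the 40-word threshold is the one named in the list.

```python
import re

MAX_WORDS = 40  # threshold for an "extra-long" sentence


def split_sentences(text):
    """Naive splitter: breaks after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def flag_long_sentences(text, max_words=MAX_WORDS):
    """Return the sentences that a human pre-editor should split."""
    return [s for s in split_sentences(text) if len(s.split()) > max_words]


def unify_spelling(text, variants, canonical):
    """Replace every listed spelling variant with one canonical form."""
    for variant in variants:
        text = re.sub(re.escape(variant), canonical, text, flags=re.IGNORECASE)
    return text


print(unify_spelling("A data-base and a data base.",
                     ["data-base", "data base"], "database"))
```

The sketch only flags long sentences rather than splitting them, because choosing the split point and the linking elements remains the editor's decision.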

    25. General scheme of machine translation.

    All over the world, the use of machine translation systems, despite all their weaknesses, has long been an element of the professional work of a translator who must be able to use a computer not only as a typewriter. The concept of a translator's workstation, which includes a complex of resident dictionaries, thesauri, spell checking systems, systems for accessing information over various data transmission networks, should become commonplace for a specialist philologist.

    A machine translation (MT) system can be used as part of such a translator's workstation, providing high-quality translation that is strictly focused on a specific subject area, on the user's tasks and on the type of documentation. In addition, such a system can help a user who does not know a foreign language to obtain, very quickly and at low cost, an approximate (rough) translation of texts in the field of interest, sufficient to understand the information conveyed by the foreign-language text.

    General requirements for practical machine translation (MT) systems

    · System stability. The MT system should give a result that can be used even in the case of defects in the source material and incomplete vocabulary.

    · System replicability. The system should have fairly simple software and linguistic tools to expand the scope of its application.

    · System adaptability. The MT system should have means of customization to the needs of specific users and to the features of the documents being processed.

    · Timing Optimality. The speed of translation of texts must correspond either to the volume of information received per unit of time, or to the norms of users' work.

    · User comfort. Service tools of the system should ensure the convenience of the user in all possible modes of operation in the system.

    When working with a particular machine translation system, it should be remembered that translation is carried out at several subordinate levels of the system implementation.

    These levels generally include:

    · level of automatic text pre-editing;

    · level of lexical and morphological analysis;

    · the level of contextual analysis and group analysis;

    · level of analysis of functional segments;

    · level of sentence analysis;

    · the level of output text synthesis;

    · automatic post-editing level.

    Approaches to machine translation

    Machine translation systems can use a translation method based on linguistic rules. In the simplest case, words of the source language are simply replaced with the most suitable words of the target language.

    It is often argued that in order to successfully solve the problem of machine translation, it is necessary to solve the problem of understanding text in natural language.

    As a rule, the rule-based translation method uses a symbolic intermediate representation (an intermediary), from which the text in the target language is created. Depending on the nature of this intermediary, one speaks of interlinguistic machine translation or transfer machine translation. These methods require very large dictionaries with morphological, syntactic and semantic information, and a large set of rules.

    If the machine translation system has enough data, a translation of good quality can be obtained. The main difficulty lies in building up these data. For example, the large text corpora needed for statistical translation methods are not sufficient for grammar-based translation; the latter additionally requires the grammar itself to be specified.

    For translation between related languages (for example, Russian and Ukrainian), a simple replacement of words may be sufficient.

    Modern machine translation systems are divided into three large groups:

    · rule-based;

    · example-based;

    · statistical (SMT).

    Rule-based machine translation is a general term for machine translation systems that rely on linguistic information about the source and target languages.

    They consist of bilingual dictionaries and grammars, covering the main semantic, morphological, syntactic patterns of each language. This approach to machine translation is also called classical.

    Based on these data, the source text is sequentially, by sentences, converted into the translated text. Often, such systems are contrasted with machine translation systems that are based on examples.

    Such systems operate by relating the structure of the input sentence to the structure of the output sentence. The translation quality is not particularly high, but the approach works for simple examples.

    Translation from English to German will look like this:

    A girl eats an apple. Ein Mädchen isst einen Apfel.

    These systems fall into three groups:

    · word-for-word translation systems;

    · transfer systems;

    · interlinguistic systems.

    Word-for-word translation

    Such systems are now used extremely rarely because of the poor quality of translation. The words of the source text are converted, as they are, into words of the translated text. Often this conversion is done without lemmatization or morphological analysis. This is the simplest machine translation method. It is used to translate long lists of words (e.g., catalogs). It can also be used to produce interlinear translations for translation memory (TM) systems.
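
A minimal sketch of word-for-word translation, assuming a toy English-German dictionary invented for illustration (the dictionary entries and the function name `word_for_word` are hypothetical):

```python
# Toy English-German dictionary (illustrative entries only).
DICTIONARY = {
    "a": "ein",
    "an": "ein",
    "girl": "Mädchen",
    "eats": "isst",
    "apple": "Apfel",
}


def word_for_word(sentence, dictionary):
    """Replace each word by its dictionary entry; unknown words pass through."""
    return " ".join(dictionary.get(w.lower(), w) for w in sentence.split())


print(word_for_word("A girl eats an apple", DICTIONARY))
# prints: ein Mädchen isst ein Apfel
```

The output shows why the quality is poor: with no morphological analysis, the article is not inflected for the accusative ("ein Apfel" instead of "einen Apfel") and sentence-initial capitalization is lost.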

    Transfer systems

    Transfer systems and interlinguistic systems share the same general idea. Translation requires an intermediary that carries the meaning of the expression being translated. In interlinguistic systems the intermediary does not depend on the language pair, while in transfer systems it does.

    Transfer systems work on a very simple principle: rules are applied to the input text that match the structures of the source and target languages. The initial stage of work includes morphological, syntactic (and sometimes semantic) analysis of the text to create an internal representation. The translation is generated from this representation using bilingual dictionaries and grammar rules. Sometimes, based on the primary representation, which was obtained from the source text, a more "abstract" internal representation is built. This is done in order to emphasize places important for translation, and to discard non-essential parts of the text. When constructing the translation text, the transformation of the levels of internal representations occurs in the reverse order.

    This strategy yields translations of fairly high quality, with an accuracy in the region of 90% (although this depends heavily on the language pair). The operation of any transfer system consists of at least five parts:

    · morphological analysis;

    · lexical categorization;

    · lexical transfer;

    · structural transfer;

    · morphological generation.

    Morphological analysis. The words of the source text are classified by part of speech. Their morphological features are identified, and the lemma of each word is determined.

    Lexical categorization. In any text, some words may have more than one meaning, which makes the analysis ambiguous. Lexical categorization examines the context of a word to resolve this. Various notes and clarifications may be attached as a result.

    Lexical transfer. On the basis of a bilingual dictionary, the lemmas of words are translated. The action is very similar to word-for-word translation.

    Structural transfer. The words are made to agree with one another within the sentence.

    Morphological generation. Based on the output data of the structural transfer, word forms of the translated text are created.
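
The five stages described above can be sketched on the earlier example sentence. This is a toy illustration rather than a real transfer system: all lexicon tables are invented, the case assignment assumes simple subject-verb-object order, and every article is assumed to stand directly before its noun.

```python
# Toy English -> German transfer pipeline; all lexicon data is illustrative.
ANALYSIS = {  # surface form -> (lemma, part of speech)
    "a": ("a", "DET"), "an": ("a", "DET"),
    "girl": ("girl", "NOUN"), "apple": ("apple", "NOUN"),
    "eats": ("eat", "VERB"),
}
BILINGUAL = {"a": "ein", "girl": "Mädchen", "apple": "Apfel", "eat": "essen"}
GENDER = {"Mädchen": "neut", "Apfel": "masc"}
ARTICLE = {("neut", "nom"): "ein", ("masc", "nom"): "ein",
           ("neut", "acc"): "ein", ("masc", "acc"): "einen"}
VERB_FORMS = {"essen": "isst"}  # third person singular, present tense


def morphological_analysis(sentence):
    """Map each word to its lemma and part of speech."""
    return [ANALYSIS[w.lower()] for w in sentence.rstrip(".").split()]


def lexical_categorization(tokens):
    """Nothing is ambiguous in the toy lexicon; a real system would
    choose among competing readings here."""
    return tokens


def lexical_transfer(tokens):
    """Translate lemmas with the bilingual dictionary."""
    return [(BILINGUAL[lemma], pos) for lemma, pos in tokens]


def structural_transfer(tokens):
    """Mark case: nominative before the verb, accusative after it."""
    out, case = [], "nom"
    for lemma, pos in tokens:
        if pos == "VERB":
            case = "acc"
            out.append((lemma, pos, None))
        else:
            out.append((lemma, pos, case))
    return out


def morphological_generation(tokens):
    """Produce inflected word forms from lemmas and case marks."""
    words = []
    for i, (lemma, pos, case) in enumerate(tokens):
        if pos == "DET":
            noun = tokens[i + 1][0]  # the article agrees with the next noun
            words.append(ARTICLE[(GENDER[noun], case)])
        elif pos == "VERB":
            words.append(VERB_FORMS[lemma])
        else:
            words.append(lemma)
    text = " ".join(words)
    return text[0].upper() + text[1:] + "."


def translate(sentence):
    tokens = morphological_analysis(sentence)
    tokens = lexical_categorization(tokens)
    tokens = lexical_transfer(tokens)
    tokens = structural_transfer(tokens)
    return morphological_generation(tokens)


print(translate("A girl eats an apple."))
# prints: Ein Mädchen isst einen Apfel.
```

Unlike plain word replacement, the structural transfer step supplies the case information that lets morphological generation choose "einen" for the masculine object.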

    One of the main features of transfer-based machine translation systems is the stage at which the intermediate representation of the text in the source language is "transferred" to an intermediate representation of the text in the target language. This transfer can operate at one of two levels of linguistic analysis, or at both.

    1. Surface (syntactic) transfer. This level is characterized by the transfer of "syntactic structures" between the source and target languages. It is suitable for languages of the same family or the same type, for example the Romance languages: Italian, Spanish, Catalan, French, etc.

    2. Deep (semantic) transfer. This level works with a semantic representation that still depends on the source language. The representation may consist of a number of structures that encode meaning. Translation at this level usually also requires structural transfer. It is used for translation between more distant languages.

    Interlinguistic machine translation

    Interlinguistic machine translation is one of the classic approaches to machine translation. The source text is transformed into an abstract representation that is independent of any language (unlike in transfer translation). The translated text is generated from this representation. The main advantage of this approach lies in how a new language is added to the system. It can be shown mathematically that, within this approach, each new language costs less to add than it would in, for example, a transfer translation system. Moreover, this approach makes it possible:

    · to implement "retelling" of a text, that is, paraphrasing the source text within one language;

    · to implement translation between very different languages, such as Russian and Arabic, relatively simply.
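
The cost argument for adding a new language is usually made by counting modules. As a sketch under stated assumptions (the symbols n, N_transfer and N_interlingua are introduced here for illustration, and translation is assumed to be needed in both directions between every pair of n languages): a transfer design needs one transfer module per ordered language pair, whereas an interlinguistic design needs one analyzer and one generator per language.

```latex
\[
N_{\text{transfer}} = n(n-1), \qquad N_{\text{interlingua}} = 2n
\]
```

For n = 5 this gives 20 transfer modules against 10 interlinguistic components. Adding a sixth language requires 10 new transfer modules (5 in each direction) but only 2 new interlinguistic components.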

    However, there are still no implementations of this approach that work correctly for at least two languages. Many experts doubt that such an implementation is possible. The greatest difficulty in creating such systems lies in designing the interlingual representation. It must be abstract and independent of specific languages, yet at the same time reflect the features of any existing language. Moreover, within artificial intelligence the task of extracting the meaning of a text has not yet been solved.

    The interlinguistic approach was first proposed in the 17th century by Descartes and Leibniz, who proposed universal dictionaries using numerical codes. Others such as Cave Beck, Athanasius Kircher and Johann Joachim Becher worked to develop an unambiguous universal language based on the principles of logic and iconography.

    In 1668, John Wilkins described his interlingua in his treatise An Essay towards a Real Character, and a Philosophical Language.

    In the 18th and 19th centuries, many universal languages were developed, including Esperanto. The idea of a universal language played no part in the early stages of machine translation, when only pairs of languages were considered. However, during the 1950s and 1960s, researchers in Cambridge led by Margaret Masterman, in Leningrad led by Nikolai Andreev, and in Milan led by Silvio Ceccato began work in this area.

    In the 1970s and 1980s, some progress was made in this area and a number of machine translation systems were built.

    In this translation method, the interlingual representation can be thought of as a way of describing the analysis of a text in the source language. The morphological and syntactic characteristics of the text are preserved in the representation. It is assumed that this makes it possible to convey the "meaning" when the translated text is generated.

    Sometimes two interlingual representations are used: one reflects the characteristics of the source language more closely, the other those of the target language. Translation is then carried out in two stages.

    In some cases, two or more representations of the same level are used (equally close to both languages), but differing in subject matter. This is necessary to improve the quality of translation of specific texts.

    This approach is not new to linguistics. It is based on the idea of the proximity of languages. To improve translation quality, a natural language is used as a bridge between two other languages. For example, Russian is sometimes used when translating from Ukrainian into English.
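
Using a natural language as a bridge amounts to composing two translation steps. A minimal sketch, assuming two toy word lists (Ukrainian-Russian and Russian-English) invented for illustration; the helper names are hypothetical, and word-by-word lookup stands in for whatever translation method each step would really use:

```python
def make_word_translator(dictionary):
    """Build a translator that looks up each word; unknown words pass through."""
    def translate(text):
        return " ".join(dictionary.get(w, w) for w in text.split())
    return translate


def translate_via_pivot(text, source_to_pivot, pivot_to_target):
    """Translate in two hops: source -> pivot -> target."""
    return pivot_to_target(source_to_pivot(text))


# Toy word lists (illustrative only).
uk_ru = make_word_translator({"дівчина": "девушка", "їсть": "ест", "яблуко": "яблоко"})
ru_en = make_word_translator({"девушка": "girl", "ест": "eats", "яблоко": "apple"})

print(translate_via_pivot("дівчина їсть яблуко", uk_ru, ru_en))
# prints: girl eats apple
```

The first hop is between closely related languages, where simple replacement may be adequate; any error made there is carried into the second hop.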

    An interlinguistic machine translation system requires:

    · dictionaries for the analysis and generation of texts;

    · descriptions of the grammars of the languages;

    · a knowledge base of concepts (to create the interlingual representation);

    · rules for projecting concepts onto languages and representations.

    The most difficult problem in creating a system of this type is that a knowledge base cannot be built for broad areas of knowledge, and the knowledge bases that are created for a very narrow topic have high computational complexity.