CS/ENGRI 172, Fall 2003: Computation, Information, and Intelligence 11 ...

nt color=blue>Help for Webmasters « back to results for ""
Below is a cache of http://www.cs.cornell.edu/courses/CS172/2003fa/handouts/L35.pdf. It's a snapshot of the page taken as our search engine crawled the Web.
The web site itself may have changed. You can check the current page or check for previous versions at the Internet Archive. Yahoo! is not affiliated with the authors of this page or responsible for its content.
CS/ENGRI 172, Fall 2003: Computation, Information, and Intelligence 11/21/03: Machine Translation
CS/ENGRI 172, Fall 2003: Computation, Information, and Intelligence
11/21/03: Machine Translation
Machine Translation Paradigms
Machine translation (MT) is the automatic translation from a source language (SL) to a target
language (TL), while preserving the meaning of the source text within the target text. There are
three styles of approach to this problem: direct replacement, syntactic transfer, and interlingua.
They vary in their exibility and the amount of computational resources required.
Statistical Machine Translation and the IBM Candide System
Assume word-for-word translations, with no insertions or deletions of words permitted between
source and target sentences.
1
We compute translation probabilities using an auxiliary source of
information: alignments in sentence pairs made up of mutual translations.
The algorithm description uses the following notation:
tr(
s t) the transition weight (or probability) that source-language word s should be translated
as target-language word
t.
p
(1)
, p
(2)
, . . ., p
(N)
the source/target-language sentence pairs in the training corpus
p
(i)
= (
s
(i)
1
s
(i)
2
. . . s
(i)
l
i
;
t
(i)
1
t
(i)
2
. . . t
(i)
l
i
) the
ith sentence pair, where s
(i)
1
. . .s
(i)
l
i
is an
l
i
-word source-
language sentence and
t
(i)
1
. . .t
(i)
l
i
is its
l
i
-word translation.
2
(1
j
1
; 2
j
2
;
. . .; l
i
j
l
i
) an alignment, which lists for each of the
l
i
words in the source-
language sentence which word of the target-language sentence it is aligned to.
A
(i)
1
,
A
(i)
2
, . . ., A
(i)
m
i
a set of
m
i
possible alignments associated with sentence pair
p
i
, where
m
i
=
(
l
i
)(
l
i
1)(l
i
2) . . .(2)(1).
Contains(
s t) is the set of alignments A in which source-language word s is aligned with target-
language word
t.
3
freq(
s t; A) is the number of times we have the word s aligned to t in alignment A.
Using the variable
A to stand for an alignment drawn from an arbitrary sentence pair, we say that
every alignment
A has an alignment weight (or probability) awt(A).
1
This is clearly a simplication of full machine translation.
2
Note that the
s
(i)
j
s and
t
(i)
j
s do not have to be distinct. The subscript
i reects that dierent sentence pairs may
have dierent lengths, though we assume that corresponding source and target language sentences have the same
number of words.
3
Notice that Contains(
s t) can include alignments from dierent sentence pairs.
1 An Iterative Learning Algorithm for MT
1. Initialization: For every sentence pair
p
i
, set awt(
A
(i)
1
) =
· · · = awt(A
(i)
m
i
) = 1
/(m
i
).
2. Repeat the following steps in order until no more changes occur:
3.
Update translation weights: For every source/target word pair (
s, t), change tr(s t) to:
A
in Contains
(st)
freq(
s t, A)awt(A)
4.
Psuedo-normalize translation weights: Change each weight tr(
s t) to
tr(
s t)
t
tr(
s t )
where
t ranges over all target language words.
5.
Update alignment weights: For every
A
(i)
k
= (1
j
1
; 2
j
2
;
. . .; l
i
j
l
i
), change awt(
A
(i)
k
)
to
tr(
s
1
t
j
1
)tr(
s
2
t
j
2
)
· · ·tr(s
l
i
t
j
li
).
6.
Psuedo-normalize alignment weights: For every alignment
A
(i)
k
, change awt(
A
(i)
k
) to
awt(
A
(i)
k
)
m
i
q=1
awt(
A
(i)
q
)
Example
Suppose we have two sentence pairs:
p
1
= (
chat bleu, blue cat) and p
2
= (
chat; cat). This yields
three alignments:
A
(1)
1
= (1
1; 2 2) (so chat aligned to blue)
A
(1)
2
= (1
2; 2 1) (so chat aligned to cat)
A
(2)
1
= (1
1) (only one possible choice)
The
algorithm will then compute the following translation and alignment weights. After conver-
gence, the translation weights indicate our learned word-for-word translations.
awt(
A
(1)
1
)
awt(
A
(1)
2
)
awt(
A
(2)
1
)
tr(
chat blue)
tr(
chat cat)
tr(
bleu blue)
tr(
bleu cat)
a. Init
1/2
1/2
1
b. Update-tr
1/2
1/2
1
1/2
3/2
1/2
1/2
c. Pnorm-tr
1/2
1/2
1
1/4
3/4
1/2
1/2
d. Update-awt
1/8
3/8
3/4
1/4
3/4
1/2
1/2
e. Pnorm-awt
1/4
3/4
1
1/4
3/4
1/2
1/2
f. Update-tr
1/4
3/4
1
1/4
7/4
3/4
1/4
g. Pnorm-tr
1/4
3/4
1
1/8
7/8
3/4
1/4
2