
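The previous post, on unsupervised machine translation ("An Effective Approach to Unsupervised Machine Translation"), noted that some of today's strong machine translation models can translate without using any parallel corpus.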



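To build your own mapping between cross-lingual word embeddings, first choose a monolingual word embedding training tool (e.g. word2vec or fastText), then use today's tool, vecmap, to map the embeddings of one language into the space of the other.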
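To make the idea of "mapping" concrete, here is a toy sketch in two dimensions: given a few paired points, it finds the rotation that best aligns the source points with the target points. This only illustrates the general idea of aligning two vector spaces. It is not vecmap's actual procedure, which works on full-dimensional embeddings, and the points and helper names (`fit_rotation`, `rotate`) are made up for this example.

```python
import math


def fit_rotation(src_points, trg_points):
    """Return the angle of the 2D rotation that best aligns src with trg (least squares)."""
    dot = sum(x1 * y1 + x2 * y2
              for (x1, x2), (y1, y2) in zip(src_points, trg_points))
    cross = sum(x1 * y2 - x2 * y1
                for (x1, x2), (y1, y2) in zip(src_points, trg_points))
    return math.atan2(cross, dot)


def rotate(point, angle):
    """Rotate a 2D point counter-clockwise by the given angle (radians)."""
    x1, x2 = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x1 - s * x2, s * x1 + c * x2)


# Toy data: the "target space" is the source space rotated by 30 degrees
true_angle = math.radians(30)
src_points = [(1.0, 0.0), (0.0, 2.0), (-1.5, 0.5)]
trg_points = [rotate(p, true_angle) for p in src_points]

# Recover the rotation from the paired points and map the source points
angle = fit_rotation(src_points, trg_points)
mapped = [rotate(p, angle) for p in src_points]
```

After the mapping, each source point lands on its paired target point, which is the property a cross-lingual mapping aims for with word pairs that are translations of each other.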





python3 map_embeddings.py --supervised TRAIN.DICT SRC.EMB TRG.EMB SRC_MAPPED.EMB TRG_MAPPED.EMB


python3 map_embeddings.py --semi_supervised TRAIN.DICT SRC.EMB TRG.EMB SRC_MAPPED.EMB TRG_MAPPED.EMB


python3 map_embeddings.py --identical SRC.EMB TRG.EMB SRC_MAPPED.EMB TRG_MAPPED.EMB


python3 map_embeddings.py --unsupervised SRC.EMB TRG.EMB SRC_MAPPED.EMB TRG_MAPPED.EMB
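The supervised and semi-supervised modes take a training dictionary (TRAIN.DICT). The sketch below writes and reads such a file, assuming one whitespace-separated source/target word pair per line, the same layout as the word pairs listed later in this post. The helper names `write_dictionary` and `read_dictionary` are hypothetical, and the seed pairs are only an illustration.

```python
import os
import tempfile


def write_dictionary(pairs, path):
    """Write (source, target) word pairs, one whitespace-separated pair per line."""
    with open(path, 'w', encoding='utf8') as f:
        for src, trg in pairs:
            f.write(f'{src} {trg}\n')


def read_dictionary(path):
    """Read the pairs back, skipping blank or malformed lines."""
    pairs = []
    with open(path, 'r', encoding='utf8') as f:
        for line in f:
            parts = line.split()
            if len(parts) == 2:
                pairs.append((parts[0], parts[1]))
    return pairs


# A tiny seed dictionary, for illustration only
seed_pairs = [('王后', 'queen'), ('他', 'he'), ('我', 'i')]
dict_path = os.path.join(tempfile.mkdtemp(), 'TRAIN.DICT')
write_dictionary(seed_pairs, dict_path)
loaded_pairs = read_dictionary(dict_path)
```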




In the harsh winter season, the goose-like snow flakes are flying around in the sky. There is a queen sitting in a window in the palace, doing needlework for her daughter, the wind blowing snow flakes into the window, the ebony window sill There are a lot of snowflakes falling on it. She looked up and looked out the window. She did not pay attention, and a needle stuck into her finger. The red blood flowed out of the needle, and three drops of blood fell on the snow on the window. She thoughtfully stared at the red blood drops on the white snow, and looked at the ebony window sill. She said, "I hope my little daughter's skin will be white and red, and it looks like this white snow and The red blood is the same, so gorgeous, so arrogant, the hair looks like the ebony of this window is generally black and bright!"
Her little daughter has grown up, and the little girl is so beautiful that she is beautiful, beautiful and moving. Her skin is really white like snow, ruddy with blood, and black hair like ebony. So the queen gave her a name, called Snow White. But Snow White has not grown up, and her queen mother died.
Soon, the father of the king married another wife. This queen is very beautiful, but she is very proud and conceited. She is very strong and can't stand it if she hears someone is more beautiful than her. She has a mirror, she often goes to the mirror to appreciate herself and asks: "Tell me, mirror, tell me the truth! All the women here are the most beautiful? Tell me who she is?"
The mirror replied: "It's you, queen! You are the most beautiful woman here."
When she heard this, she would smile with satisfaction. But Snow White grew up slowly and became more and more beautiful. By the time she was seven years old, she was more dazzling than the bright spring, more beautiful than the queen. Until one day, when the queen went to ask the mirror as usual, the mirror made the answer: "The queen, you are beautiful and beautiful, but Bai Xuegong is more beautiful than you!"
She heard this, her heart filled with anger and jealousy, and her face became pale. She called a servant and said to him, "Give me Snow White to the big forest. I don't want to see her anymore." The servant took Snow White away. When he was about to kill her in the forest, she cried and begged him not to kill her. Facing the pleading of the pitiful little princess, the servant’s sympathy came to life. He said, “You are a child who loves you, I will not kill you.” In this way, he left her alone. In the forest. When the servant decided not to kill Snow White and left her there, even though he knew that in the uninhabited big forest, she would be torn into pieces by the beast, but thought that he did not have to kill her by hand. He felt that a heavy stone that was pressing on his heart fell.
After the servant left, Snow White was very scared. She was everywhere in the forest, looking for a way out. The beast screamed beside her, but did not hurt her. In the evening, she came to a small house. When she determined that there was no one in the house, she pushed the door and went to rest, because she was really unable to move. As soon as she entered the door, she found that everything in the house was well organized and clean. A table was covered with white cloth with seven small plates, each with a piece of bread and some other food. There were seven glasses filled with wine next to the plate, seven knives. And the fork, etc., and the wall is also discharged with seven small beds. At this time she felt hungry and thirsty, and she did not care who it was. She went up to cut a small piece of bread from each piece of bread and drank a little bit of wine in each glass. After eating and drinking, she felt very tired and wanted to lie down and rest. So she came to the bed and almost tried every single of the seven beds. It was not too long, it was too short. It was not until the seventh bed was tried. She lay down on it and soon fell asleep.




from gensim.models import word2vec
import jieba


def get_zh_embedding():
    # Read the Chinese corpus, one sentence per non-empty line
    sentences = []
    with open('zh.txt', 'r', encoding='utf8') as f:
        for line in f:
            line = line.strip()
            if line:
                sentences.append(line)

    # Segment a sentence into words with jieba
    def segment_sen(sen):
        return jieba.lcut(sen)

    # Convert the data into the list-of-token-lists format that gensim's Word2Vec expects
    sens_list = [segment_sen(i) for i in sentences]

    # `epochs` is the gensim >= 4.0 name; older versions call this parameter `iter`
    model = word2vec.Word2Vec(sens_list, min_count=1, epochs=20)
    model.wv.save_word2vec_format('SRC.EMB', binary=False)


def get_en_embedding():
    # Read the English corpus, lowercased, one sentence per non-empty line
    sentences = []
    with open('en.txt', 'r', encoding='utf8') as f:
        for line in f:
            line = line.strip().lower()
            if line:
                sentences.append(line)

    # Convert the data into the list-of-token-lists format that gensim's Word2Vec expects
    sens_list = [i.split() for i in sentences]

    # `epochs` is the gensim >= 4.0 name; older versions call this parameter `iter`
    model = word2vec.Word2Vec(sens_list, min_count=1, epochs=20)
    model.wv.save_word2vec_format('TRG.EMB', binary=False)


if __name__ == '__main__':
    get_zh_embedding()
    get_en_embedding()
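Both functions save their embeddings in the word2vec text format: a header line with the vocabulary size and the vector dimension, followed by one line per word. Before passing SRC.EMB and TRG.EMB to vecmap, it can help to check that the files look as expected. The reader below is a minimal sketch; `read_word2vec_text` is a hypothetical helper, and the sample file contents are made up for the example.

```python
import os
import tempfile


def read_word2vec_text(path):
    """Parse a word2vec text file: a 'count dim' header, then 'word v1 ... vd' lines."""
    with open(path, 'r', encoding='utf8') as f:
        count, dim = (int(x) for x in f.readline().split())
        vectors = {}
        for line in f:
            parts = line.rstrip().split(' ')
            if len(parts) != dim + 1:
                continue  # skip blank or malformed lines
            vectors[parts[0]] = [float(x) for x in parts[1:]]
    return count, dim, vectors


# Made-up sample file with two 3-dimensional vectors
sample = '2 3\nqueen 0.1 0.2 0.3\nsnow -0.4 0.5 0.6\n'
path = os.path.join(tempfile.mkdtemp(), 'SAMPLE.EMB')
with open(path, 'w', encoding='utf8') as f:
    f.write(sample)

count, dim, vectors = read_word2vec_text(path)
```

If `len(vectors)` differs from `count`, the file was probably truncated or contains tokens with embedded spaces.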





python3 map_embeddings.py --unsupervised SRC.EMB TRG.EMB SRC_MAPPED.EMB TRG_MAPPED.EMB





就 on
里 in
王后 queen
白雪公主 snow
白雪公主 white
他 he
在 in
又 also
我 i
漂亮 pretty
一样 same
长 long


python3 eval_translation.py SRC_MAPPED.EMB TRG_MAPPED.EMB -d TEST.DICT
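To give a feel for what a translation evaluation measures, the sketch below computes precision@1 with cosine nearest-neighbor retrieval: for each source word in the test dictionary, it picks the closest target word in the mapped space and checks whether that word is one of the listed translations. This is a simplified illustration, not the implementation in eval_translation.py. The function names and the two-dimensional toy vectors are made up.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_target(src_vec, trg_vectors):
    """Return the target word whose vector is most similar to src_vec."""
    return max(trg_vectors, key=lambda w: cosine(src_vec, trg_vectors[w]))


def precision_at_1(src_vectors, trg_vectors, test_pairs):
    """Fraction of covered source words whose nearest target is a gold translation."""
    gold = {}
    for src, trg in test_pairs:
        gold.setdefault(src, set()).add(trg)  # a source word may have several translations
    covered = [s for s in gold if s in src_vectors]
    if not covered:
        return 0.0
    hits = sum(nearest_target(src_vectors[s], trg_vectors) in gold[s] for s in covered)
    return hits / len(covered)


# Toy "mapped" embeddings, for illustration only
src_vectors = {'王后': [1.0, 0.0], '他': [0.0, 1.0], '长': [1.0, 1.0]}
trg_vectors = {'queen': [0.9, 0.1], 'he': [0.1, 0.9], 'long': [-1.0, 0.2]}
test_pairs = [('王后', 'queen'), ('他', 'he'), ('长', 'long')]

score = precision_at_1(src_vectors, trg_vectors, test_pairs)
```

In this toy data two of the three source words retrieve their gold translation, so the score is 2/3. Because gold translations are stored as a set, a source word with several entries (such as 白雪公主 paired with both snow and white above) counts as correct if either is retrieved.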


