Dear all,
I recently came across a free, publicly available language resource that looks very interesting:
http://optima.jrc.it/Resources/DCEP-2013/DCEP-extract-README.html
The readme page says the sentence-pair extraction has to be done with Python, so I tried it, without success (I'm a newbie with Python).
Is there anybody among you who is familiar with that programming language?
This is what the readme page says:
"
Download and extract the alignment information:
wget optima.jrc.it/.../DCEP-DA-LV.tar.bz2
tar jxf DCEP-DA-LV.tar.bz2
Now we download, extract, and run the tool that generates the bicorpus from the above data:
wget optima.jrc.it/.../DCEP-extract-scripts.tar.bz2
tar jxvf DCEP-extract-scripts.tar.bz2
./src/languagepair.py DA-LV > DA-LV-bisentences.txt
"
With Python, I have only managed to download the files.
The "tar" command didn't work, so I simply extracted the data with 7zip.
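For what it's worth, archives like these can presumably also be unpacked from Python itself, using the standard-library tarfile module instead of the "tar" command. A minimal sketch, assuming the downloaded archive sits in the current folder; the function name extract_tar_bz2 is made up for illustration and is not part of the DCEP scripts:

```python
import tarfile
from pathlib import Path


def extract_tar_bz2(archive_path, dest_dir="."):
    """Unpack a .tar.bz2 archive into dest_dir and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # "r:bz2" opens a bzip2-compressed tar file, much like `tar jxf` does.
    with tarfile.open(archive_path, "r:bz2") as archive:
        names = archive.getnames()
        archive.extractall(dest)
    return names


# Example call (assumes the archive was saved next to this script):
# extract_tar_bz2("DCEP-DA-LV.tar.bz2")
```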
Basically, I managed to work around steps 1, 2, 3 and 4.
The problem is the last step, the final sentence-pair extraction.
In the command prompt, I first started Python and then typed the command, following the instructions on the readme page:
./src/languagepair.py DA-LV > DA-LV-bisentences.txt
but I receive the following error message:
"
Python 3.6.0 (v3.6.0:41df79263a11, Dec 23 2016, 08:06:12) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> ./src/languagepair.py EN-IT > EN-IT-bisentences.txt
  File "<stdin>", line 1
    ./src/languagepair.py EN-IT > EN-IT-bisentences.txt
    ^
SyntaxError: invalid syntax
"
I also tried the command outside Python, directly in the command prompt, with a similar result:
"
'.' is not recognized as an internal or external command,
operable program or batch file.
"
I don't know how to get those sentence pairs extracted...
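For what it's worth, both messages seem to point the same way. At the ">>>" prompt everything typed is read as Python code, so a shell command ends in a SyntaxError, and the Windows command prompt does not run a script given as "./path" the way a Unix shell does. The script probably has to be handed to the Python interpreter explicitly. A minimal sketch of that idea, assuming the scripts were extracted into a "src" folder under the current directory; the helper name run_script_to_file is made up for illustration, and whether languagepair.py itself runs cleanly under Python 3.6 is a separate question this does not answer:

```python
import subprocess
import sys


def run_script_to_file(script_path, script_args, output_path):
    """Run a Python script with the current interpreter; save stdout to a file."""
    command = [sys.executable, script_path, *script_args]
    # Binary mode writes the script's output bytes to the file unchanged.
    with open(output_path, "wb") as output_file:
        completed = subprocess.run(command, stdout=output_file)
    # A return code of 0 conventionally means the script finished without error.
    return completed.returncode


# Example call (paths as in the readme, assumed relative to the current folder):
# run_script_to_file("src/languagepair.py", ["EN-IT"], "EN-IT-bisentences.txt")
```

Typed directly at the command prompt (not at ">>>"), the equivalent would presumably be: python src\languagepair.py EN-IT > EN-IT-bisentences.txt, assuming "python" is on the PATH.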
I wrote to the file owners (the EU), but they couldn't help me with this issue.
I also posted the issue on Codecademy, in case you would like more information about it:
https://discuss.codecademy.com/t/tar-file-extraction/74560/5
Is there anybody among you who can help me with the sentence extraction?
Thanks,
Davide