Error while APPLYING SP MODEL ON [train] #1

Open
mukesh7 opened this issue Apr 8, 2021 · 0 comments

mukesh7 commented Apr 8, 2021

I'm working on sw-en translation. After running build-training-data.sh, train.src looks like this: <2eng> Je, utatupa hii?
When I run preprocess.sh I get the error below. As I understand it, the script is not able to read the '<' characters.

Traceback (most recent call last):
  File "/gpfs-volume/Afro-NMT/scripts/spm-subword.py", line 95, in <module>
    main()
  File "/gpfs-volume/Afro-NMT/scripts/spm-subword.py", line 79, in main
    encode(args.spm_dir, args.in_file, args.src, args.op_file)
  File "/gpfs-volume/Afro-NMT/scripts/spm-subword.py", line 36, in encode
    print(' '.join(spm_spp.EncodeAsPieces(line.strip())), file=outfile)
UnicodeEncodeError: 'ascii' codec can't encode character '\u2581' in position 0: ordinal not in range(128)

Can you help me resolve this?
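
For reference, the traceback fails while writing, not while reading. '\u2581' (U+2581) is the marker SentencePiece puts on pieces at word boundaries. The 'ascii' codec in the message suggests the output file was opened with an ASCII default encoding, which typically happens under a C/POSIX locale. Below is a minimal sketch of one possible fix, assuming the script opens its output with a plain open() call. The source of spm-subword.py is not shown here, so write_pieces, ascii_write_fails and the sample pieces are hypothetical stand-ins, not the script's actual code.

```python
import os
import tempfile

# Hypothetical pieces standing in for the output of EncodeAsPieces;
# a real SentencePiece model may split the line differently.
SAMPLE = [["\u2581<2eng>", "\u2581Je", ",", "\u2581utatupa", "\u2581hii", "?"]]


def write_pieces(lines_of_pieces, path):
    """Write one space-joined line per list of pieces, always encoded as UTF-8."""
    # Passing encoding="utf-8" explicitly makes the result independent of the locale.
    with open(path, "w", encoding="utf-8") as outfile:
        for pieces in lines_of_pieces:
            print(" ".join(pieces), file=outfile)


def ascii_write_fails(path):
    """Reproduce the reported error by forcing an ASCII-encoded output file."""
    try:
        with open(path, "w", encoding="ascii") as outfile:
            print(" ".join(SAMPLE[0]), file=outfile)
    except UnicodeEncodeError:
        return True
    return False


if __name__ == "__main__":
    tmp_dir = tempfile.mkdtemp()
    print("ASCII write fails:", ascii_write_fails(os.path.join(tmp_dir, "ascii.txt")))
    write_pieces(SAMPLE, os.path.join(tmp_dir, "utf8.txt"))
    print("UTF-8 write succeeded")
```

If editing the script is not an option, running it in a UTF-8 locale, or with PYTHONUTF8=1 on Python 3.7 and later, may have the same effect, since both change the default encoding that open() uses.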
