Feature #12223

Puppetize machine translation service on translate.lizard

Added by emmapeel 2017-02-13 08:40:05. Updated 2019-06-27 17:16:30.

Status: Resolved
Priority: Normal
Assignee:
Category: Infrastructure
Target version:
Start date: 2017-02-13
Due date:
% Done: 50%
Feature Branch: emmapeel:feature/12223-tmserver
Type of work: Sysadmin
Blueprint:
Starter:
Affected tool:
Deliverable for:

Description

Weblate uses the Translate Toolkit to provide real-time translation suggestions from an SQLite database that can be fed with .po files.

https://docs.weblate.org/en/weblate-2.10.1/admin/machine.html#tmserver

This is important because it saves translators time, especially in cumbersome documents, and helps us stay consistent not only across our own translations but also, for example, with the Debian locales if we feed them to the machine.

It is a very subtle way of increasing the quality of our translations.

So, the Debian package provides `tmserver`: http://docs.translatehouse.org/projects/translate-toolkit/en/latest/commands/tmserver.html

> tmserver is a Translation Memory service that can be queried via HTTP using a simple REST like URL/http and data is exchanged between server and client encoded in JSON.

So, first you feed the database (I have been breeding a little one myself) with different files:

build_tmdb -d /var/lib/tm/db -s en -t cs locale/cs/LC_MESSAGES/django.po

and then you can run it like this:

tmserver -d /var/lib/tm/db

and query it like this:

http://HOST:PORT/tmserver/SOURCE_LANG/TARGET_LANG/unit/STRING
http://localhost:8080/tmserver/en/de/unit/contribute
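
For reference, the query above can also be scripted. The following is a minimal Python sketch, not part of the ticket: `unit_url` and `suggestions` are hypothetical helper names, and percent-encoding the string is an assumption about what tmserver accepts for strings that contain spaces. The reply is decoded as JSON, as the tmserver description quoted above says, without assuming its exact fields.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def unit_url(host, port, source_lang, target_lang, text):
    """Build a lookup URL in the HOST:PORT/tmserver/SRC/TGT/unit/STRING form."""
    # Assumption: percent-encode the string so spaces and slashes
    # stay inside a single path segment.
    encoded = quote(text, safe="")
    return f"http://{host}:{port}/tmserver/{source_lang}/{target_lang}/unit/{encoded}"


def suggestions(host, port, source_lang, target_lang, text, timeout=5):
    """Fetch the reply and decode it as JSON; its structure is not assumed."""
    url = unit_url(host, port, source_lang, target_lang, text)
    with urlopen(url, timeout=timeout) as reply:
        return json.loads(reply.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the URL; call suggestions() when a tmserver is running.
    print(unit_url("localhost", 8080, "en", "de", "contribute"))
```

With a tmserver listening locally, `suggestions("localhost", 8080, "en", "de", "contribute")` would return whatever candidates the database holds for that string.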

I was thinking of maybe creating a systemd service to run it.

Also, what about the database? Shall we add it to the mutable_data dir?
Let me know what you think and then I will move on.


Files


Subtasks


Related issues

Related to Tails - Feature #15359: List parts of code/packages/configs to be puppetized for translation platform & its clone Resolved 2018-03-02
Blocks Tails - Feature #15180: Backups for the tmserver in translate.lizard Resolved 2018-01-17

History

#1 Updated by emmapeel 2017-02-13 08:42:08

#2 Updated by emmapeel 2017-02-14 07:19:05

  • Description updated

#3 Updated by emmapeel 2017-02-14 07:19:49

  • Description updated

#4 Updated by intrigeri 2017-02-14 13:37:44

  • Assignee changed from intrigeri to emmapeel

> So, first you feed the database (I have been breeding a little one myself) with different files:
>
> build_tmdb -d /var/lib/tm/db -s en -t cs locale/cs/LC_MESSAGES/django.po

I guess you can do this yourself. Let me know if you need anything from me. I suspect we need a dedicated user to run tmserver, and a /var/lib/tm directory owned by that tmserver user. Anything else?

> and then you can run it like this:
>
> tmserver -d /var/lib/tm/db
>
> I was thinking of maybe creating a systemd service to run it.

Sure, I can do that once the above point is clarified.

> Also, what about the database? Shall we add it to the mutable_data dir?

Given tmserver has a web API, it can run as a dedicated user, so it’s better if its DB is in a dedicated directory.

#5 Updated by intrigeri 2017-02-14 13:38:55

  • Subject changed from Puppetize machine translation service on [translate] to Puppetize machine translation service on translate.lizard
  • Assignee changed from emmapeel to intrigeri

#6 Updated by intrigeri 2017-02-14 13:39:23

  • Tracker changed from Bug to Feature
  • Status changed from New to Confirmed
  • Assignee changed from intrigeri to emmapeel
  • QA Check set to Info Needed

#7 Updated by emmapeel 2018-01-17 12:24:30

  • Assignee changed from emmapeel to intrigeri

intrigeri wrote:
> I guess you can do this yourself. Let me know if you need anything from me. I suspect we need a dedicated user to run tmserver, and a /var/lib/tm directory owned by that tmserver user. Anything else?
[…]
> Given tmserver has a web API, it can run as a dedicated user, so it’s better if its DB is in a dedicated directory.

Ok, I have pushed some suggestions for the tmserver user and its folder at
git.tails.boum.org:emmapeel/puppet-tails.git

branch: feature-15077-tmservice

The database still needs to be created, and the service still needs to start at boot. Could you please have a look?

#8 Updated by emmapeel 2018-01-17 13:02:40

  • blocks Feature #15074: Set up and configure the web interface of the translation platform added

#9 Updated by intrigeri 2018-01-17 13:29:01

  • Status changed from Confirmed to In Progress
  • Assignee changed from intrigeri to emmapeel
  • % Done changed from 0 to 10
  • QA Check changed from Info Needed to Dev Needed

WIP, we’re working on it privately: user, group and homedir are managed. Next step: systemd unit file.

#10 Updated by emmapeel 2018-01-17 14:11:13

  • blocked by Feature #15180: Backups for the tmserver in translate.lizard added

#11 Updated by emmapeel 2018-01-17 14:11:24

  • blocks deleted (Feature #15180: Backups for the tmserver in translate.lizard)

#12 Updated by emmapeel 2018-01-17 14:11:49

  • blocks Feature #15180: Backups for the tmserver in translate.lizard added

#13 Updated by emmapeel 2018-01-17 18:33:21

Ok, I have created a database with the command:

build_tmdb -d /var/lib/tmserver/db -s en -t es /var/lib/tmserver/allspanish.po

(I got the Spanish file with all the translations from the current directory using the l10n trick
“Build a translation memory to use with Poedit” at:
https://tails.boum.org/contribute/l10n_tricks/#index11h1 )

#14 Updated by emmapeel 2018-01-24 14:56:12

Ok, I am able to run the service now with the command

tmserver -d /var/lib/tmserver/db

#15 Updated by emmapeel 2018-01-24 14:57:58

  • Assignee deleted (emmapeel)
  • % Done changed from 10 to 50

OK, now somebody should create the service.

#16 Updated by Anonymous 2018-03-02 10:59:18

  • related to Feature #15359: List parts of code/packages/configs to be puppetized for translation platform & its clone added

#17 Updated by Anonymous 2018-03-13 14:32:35

  • Assignee set to emmapeel
  • QA Check changed from Dev Needed to Info Needed

Hi emmapeel,

I’m reassigning this to you so you can clarify what is missing here and whether you have provided all the information intrigeri requested.

#18 Updated by emmapeel 2018-04-03 10:59:33

  • QA Check changed from Info Needed to Dev Needed

The information is ready for the service to be created.

#19 Updated by emmapeel 2018-04-03 10:59:52

  • Assignee changed from emmapeel to intrigeri

The information is ready for the service to be created.

#20 Updated by emmapeel 2018-04-03 11:11:19

  • blocked by deleted (Feature #15074: Set up and configure the web interface of the translation platform)

#21 Updated by intrigeri 2018-04-30 17:40:38

  • Assignee changed from intrigeri to groente

FYI I’m not involved in this project, groente is handling anything Puppet.

#22 Updated by Anonymous 2018-08-17 15:17:51

Any news on this, groente?

#23 Updated by groente 2018-09-26 16:31:00

  • Assignee changed from groente to emmapeel
  • QA Check changed from Dev Needed to Info Needed

So, tmserver is now puppetised and running, but weblate doesn’t seem to like it very much yet. Any suggestions?

#24 Updated by emmapeel 2018-10-27 10:53:41

  • Assignee changed from emmapeel to drebs

I am not sure what is happening. I can get suggestions from the machine with

wget http://127.0.0.1:8080/tmserver/en/es/unit/Contact

But weblate does not seem to understand them. Drebs had found something in the code, so I am passing this to him.

#25 Updated by emmapeel 2018-10-27 11:20:03

  • Assignee changed from drebs to groente
  • QA Check changed from Info Needed to Ready for QA
  • Feature Branch set to emmapeel:feature/12223-tmserver

I got it! It’s the stupid link: it was missing part of the URL.

It is working with the changes proposed at git.tails.boum.org:emmapeel/puppet-tails.git
[feature/12223-tmserver d0eaa4c] correct link. part of Feature #12223
1 file changed, 1 insertion(+), 1 deletion(-)
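
The ticket does not record which part of the URL was missing. For orientation only: the Weblate documentation linked in the description configures the service through the `MT_TMSERVER` setting, and its example value ends in a `/tmserver/` path. Below is a sketch of such a settings fragment, assuming the host and port from the wget test in this ticket. The deployed value lives in the puppet-tails settings template and is not shown here.

```python
# Weblate settings fragment (sketch only).
# Assumption: host and port are taken from the wget test in this ticket,
# not from the deployed configuration.
# Note the /tmserver/ path component at the end of the URL.
MT_TMSERVER = "http://127.0.0.1:8080/tmserver/"
```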

#26 Updated by emmapeel 2018-10-27 12:41:42

Incidentally, we need to give someone permission to write to /var/lib/tmserver/db so that more suggestions can be added.

Currently, running

sudo -u weblate build_tmdb -d /var/lib/tmserver/db -s en -t es wiki/src/donate/thanks.es.po

fails with sqlite3.OperationalError: unable to open database file

#27 Updated by groente 2018-10-28 19:45:26

  • Assignee changed from groente to emmapeel

i’ve changed the link in the settings template and loosened the permissions on the /var/lib/tmserver directory so user weblate is now allowed to access the db file. can you verify if everything works now?

#28 Updated by emmapeel 2018-10-29 13:37:19

  • Assignee changed from emmapeel to groente

I was still not able to open the database when repeating the command above.

#29 Updated by groente 2018-10-29 16:17:23

  • Assignee changed from groente to emmapeel
  • QA Check changed from Ready for QA to Info Needed

Okay, so apparently having access to the db file is not enough: it wants write access to the whole /var/lib/tmserver directory to play around with db-journal files. I see two possible paths:

- invoke POSIX extended ACLs, which is going to be somewhat painful
- change the group ownership of /var/lib/tmserver to weblate instead of weblate_admin, which is my preference

Is there any actual reason this directory has group weblate_admin? I see some po files there, but they do not look like they need to be in this particular directory.
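
The directory requirement can be reproduced outside tmserver. The sketch below is an illustration, not taken from the ticket: it opens a throwaway SQLite database in a temporary directory and lists that directory while a write transaction is still open. In SQLite's default rollback-journal mode, the listing should include a `db-journal` file next to `db`, which is why the writing user needs write access to the directory and not only to the file. The table name `units` and its columns are made up for the example.

```python
import os
import sqlite3
import tempfile


def files_during_write(db_path):
    """List the database directory while a write transaction is still open."""
    conn = sqlite3.connect(db_path)
    try:
        # Select SQLite's default rollback-journal mode explicitly.
        conn.execute("PRAGMA journal_mode=DELETE")
        conn.execute("CREATE TABLE IF NOT EXISTS units (source TEXT, target TEXT)")
        conn.commit()
        # Python's sqlite3 opens a transaction implicitly before this INSERT.
        conn.execute("INSERT INTO units VALUES (?, ?)", ("contribute", "contribuir"))
        # Look at the directory before the transaction is committed.
        listing = sorted(os.listdir(os.path.dirname(db_path)))
        conn.commit()
    finally:
        conn.close()
    return listing


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        print(files_during_write(os.path.join(workdir, "db")))
```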

#30 Updated by emmapeel 2018-11-12 11:46:11

groente wrote:
> Okay, so apparently having access to the db file is not enough, it wants write access to the whole /var/lib/tmserver directory to play around with db-journal files. I see two possible paths:
>
> - change the group ownership of /var/lib/tmserver to weblate instead of weblate_admin, this has my preference
>
> Is there any actual reason this directory has group weblate_admin? I see some po files there, but they do not look like they need to be in this particular directory.

I think intrigeri wanted the tmservice to be run apart from weblate, to isolate them. But I am not sure.

#31 Updated by emmapeel 2018-11-12 11:46:27

  • Assignee changed from emmapeel to groente

#32 Updated by groente 2018-11-13 09:20:30

  • Assignee changed from groente to emmapeel

well, i changed the group ownership there, that should fix things. can you let me know if the script works now?

#33 Updated by emmapeel 2018-11-13 10:35:43

  • File update_tm.sh added
  • Assignee changed from emmapeel to groente
  • QA Check changed from Info Needed to Ready for QA

It works!

It says it cannot find Python-Levenshtein, so it falls back to slower matching, even though Python-Levenshtein is installed. But it adds the component!

Yay!

I have a small script that goes through the files and adds them to the translation memory. Maybe it would be nice to schedule a monthly run of it to update our suggestions. What do you think?

Attached here.

#34 Updated by groente 2018-11-13 10:58:28

  • Assignee changed from groente to emmapeel

okay, great, script deployed!
is it time to close this ticket now?

#35 Updated by emmapeel 2018-11-13 13:29:41

  • Status changed from In Progress to Resolved
  • Assignee changed from emmapeel to groente
  • QA Check changed from Ready for QA to Pass

yeah! kill it with fire!

#36 Updated by Anonymous 2019-02-07 15:34:55

#37 Updated by intrigeri 2019-06-27 17:16:30

  • Assignee deleted (groente)