Recommended size for Projects created through the API


When developing an integration with SDL Managed Translation, what are SDL's recommendations regarding Project size and batching? How many files per project? What is the recommended word volume per project?



  • Hi Fred

    There are no best practices for batch sizes, as they will vary by customer depending on their service level agreements with the translation vendor. For example, if the SLA is 24 hours, then the integration will need to provide many projects, each with a few files, so that those projects can be shared out across multiple translators. If the SLA is 5 days, then a single project with many files will be sufficient to send to one translator. Industry standards say 1500-2000 words of translation per translator per day (depending on content type), so you would map that back to what the content is, what format it is in, and how it could be separated.
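As a rough illustration of mapping the SLA back to translator throughput, here is a minimal sketch. The function name and the idea of treating the SLA as whole working days are my own assumptions, and it ignores project management and review overhead, so treat the result as a lower bound rather than a plan.

```python
import math

def translators_needed(total_words: int, sla_days: float, words_per_day: int = 1500) -> int:
    """Estimate how many translators must work in parallel to meet the SLA.

    words_per_day defaults to the low end of the 1500-2000 range quoted above.
    """
    if total_words <= 0:
        return 0
    # Words one translator can handle within the SLA window.
    capacity_per_translator = words_per_day * sla_days
    return math.ceil(total_words / capacity_per_translator)

# 15,000 words due in 24 hours needs the work shared across about 10 translators,
# so it makes sense to split it into several small projects.
print(translators_needed(15000, sla_days=1))

# The same volume with a 5-day SLA fits within the capacity of 2 translators.
print(translators_needed(15000, sla_days=5))
```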

    My advice would be to include a batch mechanism in the integration configuration page, allowing customers to make their own choice depending on the use-case and SLA. How this functions will also depend on the source content system and how it manages content files. For example, a PIM system may give you the ability to set batching based on number of products - thus a user sending 25 products with a batch set to 10, would create 3 files for translation (2 files with 10 products, 1 with 5 products). This scenario also suggests all 3 files go into one project, so again you might want to think about multiple batching mechanisms - one for file content, one for files in a project.
    A document repository would be less flexible, as it's unlikely you would want to split Word documents up through an integration. It is more likely that you would build functionality into the UI to allow users to choose one or many files at project creation.
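The two batching levels in the PIM example above (products per file, then files per project) can be sketched like this. The `chunk` helper, the product names and the `files_per_project` setting are purely illustrative and not part of any SDL API.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

def chunk(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])

# Level 1: 25 products with a batch size of 10 gives 3 files (10, 10 and 5 products).
products = [f"product-{n}" for n in range(1, 26)]
files = list(chunk(products, 10))
print([len(f) for f in files])

# Level 2: group the files into projects; here all 3 files fit in one project.
files_per_project = 50  # hypothetical, user-configurable setting
projects = list(chunk(files, files_per_project))
print(len(projects))
```

Exposing both numbers on the integration configuration page lets each customer tune them to their own use-case and SLA.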

    •  There are no technical limitations placed on project size; however, there are some practical, human limitations that should be considered, and some scalability implications. A (human) project manager will need to track, manage and consider each project. Large numbers of small projects increase the mental effort and the amount of clicking involved for that project manager. Translators, too, will find large numbers of small projects less efficient to work with than fewer larger projects. At an API level, the unit of interaction (status checking, etc.) is the project. Each project carries a lot of metadata (name, description, language info, cost info, etc.) as well as the target-language-file info. Retrieving information about 1000 single-file projects is substantially more expensive than retrieving a single 1000-file project.


    •  Unless the translation process is fully automated (machine translation only, no human touch), there is little to be gained in terms of turn-around-time by sending many small projects – batching requests and sending every ~15 minutes is unlikely to impact return dates, but could significantly improve overall efficiencies. One common concern from customers is that they wish to receive translated files back at the earliest opportunity – earlier versions of Managed Translation allowed file retrieval only for the entire project, and only when all files in the project were at “download” status. This limitation has since been removed, and it is now possible to retrieve individual target-language-files as soon as the file itself is at “download” status, and also to mark those individual files as “completed” – the overall status of the project is no longer a restricting factor. This means project creation can be batched, but retrieving translated files can occur regularly and promptly.
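To make the "batch and send every ~15 minutes" idea concrete, here is a minimal sketch of a time-window batcher. It only collects file names and hands back a batch once the window has elapsed; actually creating the project is left to the caller. The class and method names are my own invention, not anything from the Managed Translation API.

```python
import time
from typing import Callable, List, Optional

class TimedBatcher:
    """Collect file names and release them as one batch once the interval has elapsed."""

    def __init__(self, interval_seconds: float = 15 * 60,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.interval_seconds = interval_seconds  # keep this easily configurable
        self._clock = clock
        self._pending: List[str] = []
        self._window_start: Optional[float] = None

    def add(self, file_name: str) -> None:
        """Queue a file; the window starts when the first file arrives."""
        if self._window_start is None:
            self._window_start = self._clock()
        self._pending.append(file_name)

    def flush_if_due(self) -> List[str]:
        """Return the pending batch if the window has elapsed, else an empty list."""
        if self._window_start is None:
            return []
        if self._clock() - self._window_start < self.interval_seconds:
            return []
        batch, self._pending = self._pending, []
        self._window_start = None
        return batch
```

A scheduler would call `flush_if_due()` periodically and create one project from whatever comes back, while a separate, more frequent job picks up individual translated files as they become available.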


    • When considering batch size and frequency… obviously there’s a trade-off between prompt submission and efficiency of scale. Different business requirements may tolerate different batching frequencies (we suggest 15 minutes as a starting point – try it out – make sure it’s easily configurable! It may be that 30 minutes or 5 minutes are more appropriate – perhaps even just twice a day…). You may also want to consider batching by content type – e.g. promotions in one batch, reviews in another, press-releases in another… Whatever makes tracking and reporting simpler. Remember that when you speak to your translation team, and you’re referring to a piece of work, you want them to be able to find your information quickly. If your question is “when will Monday’s promotions be ready”, a batch for promotions makes the answer easy. If the promotions are mixed in a batch with everything else, you might not get such a quick response! (You may also find that your translation process dictates this type of batching anyway – different workflow required, different translation memories, etc.). Think about the priority of content – perhaps you batch premium trips separately from budget trips?
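Batching by content type is just a group-by before project creation. A small sketch, assuming each queued item is a (file name, content type) pair; the file names and type labels below are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

def group_by_content_type(items: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group queued files so that each content type ends up in its own batch."""
    batches: Dict[str, List[str]] = defaultdict(list)
    for file_name, content_type in items:
        batches[content_type].append(file_name)
    return dict(batches)

queued = [
    ("summer-sale.xml", "promotions"),
    ("hotel-review-17.xml", "reviews"),
    ("flash-deal.xml", "promotions"),
    ("q3-results.xml", "press-releases"),
]

for content_type, file_names in group_by_content_type(queued).items():
    # One project per content type keeps tracking and reporting simple.
    print(content_type, file_names)
```

The same approach works for priority: use a key such as "premium" or "budget" instead of the content type.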


    • To put some numbers to this answer, as a starting point for negotiation (nothing is concrete, this is just some rough guidance):
       Each batch should contain no more than 2000 target files
       Each target file should represent no more than 3000 words of translation.
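If you want the integration to warn when a batch drifts past these rough numbers, a check along these lines could run before project creation. It assumes one target file per source file per target language, and that the word count of a target file is roughly the word count of its source file; both are my assumptions, and the limits are only the guidance figures above, not enforced values.

```python
from typing import List

MAX_TARGET_FILES_PER_BATCH = 2000  # rough guidance from above, not a hard limit
MAX_WORDS_PER_TARGET_FILE = 3000   # rough guidance from above, not a hard limit

def check_batch(source_word_counts: List[int], target_languages: int) -> List[str]:
    """Return human-readable warnings for a proposed batch (empty list if none)."""
    warnings = []
    # Assumption: each source file produces one target file per target language.
    target_files = len(source_word_counts) * target_languages
    if target_files > MAX_TARGET_FILES_PER_BATCH:
        warnings.append(
            f"{target_files} target files exceeds the suggested {MAX_TARGET_FILES_PER_BATCH}"
        )
    oversized = [w for w in source_word_counts if w > MAX_WORDS_PER_TARGET_FILE]
    if oversized:
        warnings.append(
            f"{len(oversized)} file(s) exceed the suggested {MAX_WORDS_PER_TARGET_FILE} words"
        )
    return warnings

# 500 source files into 4 languages is exactly 2000 target files: no warnings.
print(check_batch([1000] * 500, target_languages=4))
```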