InDesign, Text Extraction Order, is this configurable?

Hi,

I was wondering what rule is applied when text is extracted from the IDML format?

From what I can see, the text gets extracted horizontally across the spread and then vertically down the page, with threaded text strings coming out in one lump. Can someone confirm this or, if not, explain the order?

The reason I ask is that, for one of my heavy DTP accounts, we are trying to export the SDLXLIFF in an order based on pages (not spreads). However, this isn't possible because the content can be mixed across two pages. As a follow-up: is it possible to get access to the methodology, perhaps to create a custom Studio InDesign API that would extract content by PAGE and vertically first?

Any form of information would be great.

James L

  • Unknown said:
    From what I can see, the text gets extracted horizontally across the spread and then vertically down the page, with threaded text strings coming out in one lump. Can someone confirm this or, if not, explain the order?

    Hi James,

    A good question and a complicated, but thorough, answer from .

    In IDML files, the stories are not stored in the order in which they are shown in InDesign. To make the document convenient to read and translate, SDL Trados Studio extracts the stories in a special predefined order. The order is calculated by starting at the top left corner of a frame and then going from that top left frame in a left-to-right direction down to the bottom left frame. For example:

    • If several frames are linked for continuation of the text flow, they are all extracted when the top left of these frames is met.
    • If several frames are grouped, all of these frames are extracted first (when the top left one is met).
    • The top left corner of a non-rectangular frame or path is calculated by overbuilding it to create a rectangular form.
    • If a frame has custom shear and rotation settings, the top left corner is calculated as if those settings were not applied.
    • If a frame is rotated using the rotation tool, new coordinates are calculated for it.

    • A frame on the pasteboard is extracted before the frames on a spread, i.e. if the first page is the right-hand page and there's a frame on the pasteboard to its left, that frame is still considered part of the spread and is extracted first.

    • Anchored objects are extracted in the order they were originally placed. If you compare the order before and after applying the anchored object's options, you'll find that in both cases it is 1-2-3.


    • Story extraction order does not depend on which layer a frame is on.
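To make the main rules above a little more concrete, here is a rough Python sketch. It is not Studio's actual code, and everything in it is an assumption for illustration: the `Frame` class, the coordinate convention (y grows downward), and the reading of the rule as "sort by top edge, then by left edge" of each frame's bounding rectangle. It only covers the bounding-rectangle and threaded-frame rules, not grouping, rotation, or anchored objects.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    story_id: str   # hypothetical: threaded frames share one story id
    points: list    # path points as (x, y) tuples, y growing downward


def top_left(frame):
    """Top left corner of the rectangle built around the frame's path."""
    xs = [x for x, _ in frame.points]
    ys = [y for _, y in frame.points]
    return min(xs), min(ys)


def extraction_order(frames):
    """Return story ids in the order their first frame is met.

    Assumption: frames are visited top to bottom, and left to right
    when their top edges are level.
    """
    def key(frame):
        x, y = top_left(frame)
        return (y, x)

    order = []
    for frame in sorted(frames, key=key):
        # A threaded story comes out whole when its first frame is met.
        if frame.story_id not in order:
            order.append(frame.story_id)
    return order


frames = [
    Frame("A", [(300, 50), (500, 50), (500, 150), (300, 150)]),
    # Story B is threaded across two frames.
    Frame("B", [(50, 400), (250, 400), (250, 500), (50, 500)]),
    Frame("B", [(50, 50), (250, 50), (250, 150), (50, 150)]),
    # Story C sits in a triangular frame; its corner is taken from the bounding rectangle.
    Frame("C", [(100, 200), (200, 300), (50, 300)]),
]

order = extraction_order(frames)
print(order)
```

In this made-up layout, story B comes out first and in one lump, even though its second frame sits at the bottom of the page, which matches what James describes for threaded text.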

    I hope that explains what you're looking for?

    Unknown said:
    As a follow-up: is it possible to get access to the methodology, perhaps to create a custom Studio InDesign API that would extract content by PAGE and vertically first?

    This is probably a harder question and I think your best bet is to ask this in the Language Developer community.
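Purely as an illustration of what a page-first order would mean, and not as anything Studio or its API provides, the ordering itself could be sketched like this in Python. The function name, the tuple layout, the fixed `page_width`, and the idea that x is measured from the left edge of the spread are all assumptions, and "vertically first" is read here as "top to bottom within each page".

```python
def page_first_order(frames, page_width):
    """Order story ids page by page, then top to bottom within a page.

    frames: list of (story_id, left, top) tuples in spread coordinates,
    with x measured from the left edge of the spread (an assumption).
    """
    def key(frame):
        _, left, top = frame
        page = int(left // page_width)   # 0 = left page, 1 = right page
        return (page, top, left)

    order = []
    for story_id, _, _ in sorted(frames, key=key):
        # A threaded story is still emitted once, where its first frame falls.
        if story_id not in order:
            order.append(story_id)
    return order


spread = [
    ("left-top", 50, 50),
    ("right-top", 650, 50),
    ("left-bottom", 50, 500),
    ("right-bottom", 650, 500),
]

by_page = page_first_order(spread, page_width=600)
print(by_page)
```

Note that a story threaded across both pages would still come out as one unit in a sketch like this, so it does not by itself solve the mixed-page content James mentions.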

    Regards

    Paul

    Paul Filkin | RWS Group

