How to find reuse metrics relatively quickly?

I am trying to get counts of the conrefs, variables, images and topics included in a baseline so that I can present some metrics about the amount of reuse (some factors are ignored). My approach is to parse the fields:

FISHVARINUSE
FISHLINKS
FISHFRAGMENTLINKS
FISHIMAGELINKS
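Counting the values in these fields could look something like the sketch below. It is only an illustration: it assumes the metadata comes back as `ishfield` elements with a `name` attribute and comma-separated values, and both the sample XML and the `count_reuse_fields` helper are hypothetical, not taken from the API documentation.

```python
import xml.etree.ElementTree as ET

# The four fields listed above.
REUSE_FIELDS = ("FISHVARINUSE", "FISHLINKS", "FISHFRAGMENTLINKS", "FISHIMAGELINKS")


def count_reuse_fields(metadata_xml):
    """Count the values held in each reuse field of one metadata response.

    Assumption: each field is an <ishfield name="..."> element whose text
    is a comma-separated list of identifiers.
    """
    counts = {name: 0 for name in REUSE_FIELDS}
    root = ET.fromstring(metadata_xml)
    for field in root.iter("ishfield"):
        name = field.get("name")
        if name in counts and field.text:
            # Split on commas and ignore empty entries.
            values = [v.strip() for v in field.text.split(",") if v.strip()]
            counts[name] += len(values)
    return counts


# Hypothetical sample response, for illustration only.
SAMPLE_XML = """<ishobjects>
  <ishobject ishref="GUID-A" ishtype="ISHModule">
    <ishfields>
      <ishfield name="FISHLINKS" level="lng">GUID-B, GUID-C</ishfield>
      <ishfield name="FISHIMAGELINKS" level="lng">GUID-IMG1</ishfield>
      <ishfield name="FISHVARINUSE" level="lng"></ishfield>
    </ishfields>
  </ishobject>
</ishobjects>"""

sample_counts = count_reuse_fields(SAMPLE_XML)
```

Summing the per-object dictionaries over every object in a baseline would then give the per-baseline totals.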

For each baseline, I get the logical id and version of each map or topic. The DocumentObj.RetrieveLanguageMetadata endpoint only allows you to specify one version per call, so I make multiple calls per baseline, one for each version included in the baseline. This is very slow, since I have to do it for thousands of baselines.
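The per-version grouping described above can be sketched as follows. This is a minimal sketch under stated assumptions: the `plan_calls` helper and the input shape (baseline id mapped to a list of logical id and version pairs) are hypothetical. It also pools the pairs across all baselines, so an object version shared by many baselines would only need to be requested once; whether that fits a given repository is untested.

```python
from collections import defaultdict


def plan_calls(baselines):
    """Group logical ids by version across all baselines.

    baselines: dict mapping a baseline id to an iterable of
    (logical_id, version) pairs (hypothetical input shape).
    Returns a dict mapping each version to a sorted list of the
    distinct logical ids that need metadata for that version.
    """
    by_version = defaultdict(set)
    for entries in baselines.values():
        for logical_id, version in entries:
            # The set removes duplicates shared between baselines.
            by_version[version].add(logical_id)
    return {version: sorted(ids) for version, ids in by_version.items()}


# Hypothetical example data.
example = {
    "baseline-1": [("GUID-A", "1"), ("GUID-B", "2")],
    "baseline-2": [("GUID-A", "1"), ("GUID-C", "1")],
}
call_plan = plan_calls(example)
```

Each entry in `call_plan` corresponds to one metadata request for one version. The results could be cached by (logical id, version) and reused when computing the totals for each baseline. Very long id lists might still need to be split into smaller batches.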

Do you know of a faster way to gather this information?

  • OK, I apparently misunderstood the meaning of the reportitem elements. I am using Baseline25.GetReport, but each object element only has a logical id and version number. The first reportitem, however, has the ishlngref, which I took to be a link rather than the object itself. Thanks!

    I am currently using Blazegraph to store RDF, and I query the triples with SPARQL. I've also used OpenRDF's native storage implementation with its API. Another approach, when trees made more sense than graphs, was to store the data in BaseX and query it with XQuery.

    Many of the problems I have had to solve when working with DITA CMS systems come down to the fact that the CMS (SDL or other) stores XML as text. Inventing my own logic to deal with that text has usually looked like more work than using a standards-based solution such as XML or RDF, so I've dealt with it by transferring the data out of the CMS into an XML database or graph database in order to work with it.