Find reuse of topics across publications

Hi Developers,

I am trying to get the reuse percentage of topics in a publication.

Currently I retrieve all topics in the publication's baseline and then look for each topic in every other publication baseline in the SDL repository; if a topic is found in another baseline, I count it as reused. For a publication with 50-100 topics this takes 2-4 hours to return. Some publications have 1000+ topics, so this approach is not practical. Is there a way to find reuse with ISHRemote or any other module?

 

Thanks

Roopesh

  • Hi Roopesh,

    I don't know if I understood you correctly, but it sounds as if you are downloading all baselines in the repository to get at the information you are looking for. That for sure will take a lot of time.

    As Dave said, you basically need some form of intermediate storage here. In terms of (hopefully) useful pointers, I am thinking along the lines of:

    • Initially download all baseline reports, store them as [baseline_id].xml in a local directory.
    • Get basic metadata (Baseline 2.5 GetMetaData) for all baselines, including the fields (the first time around you can combine this with step #1):
      • MODIFIED-ON
      • FISHLABELRELEASED

    At the end of this exercise you have all the data available locally, so processing will be fast. Processing the data to yield the reports you need should not be a problem.
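
    As a rough sketch of that local processing step (not anything ISHRemote provides): assume the downloaded reports have already been parsed into a mapping of baseline id to the set of topic ids it contains. The parsing itself depends on the report XML and is left out here; the names `baselines` and `reuse_percentage` are made up for illustration. Building an inverted index once means each topic is looked up in constant time instead of being searched for in every baseline.

```python
from collections import defaultdict


def reuse_percentage(baselines, target_id):
    """Percentage of topics in baseline `target_id` that also occur elsewhere.

    `baselines` is an assumed in-memory structure:
    dict of baseline id -> set of topic ids, built from the local report files.
    """
    # Inverted index: topic id -> set of baseline ids that contain the topic.
    index = defaultdict(set)
    for baseline_id, topics in baselines.items():
        for topic in topics:
            index[topic].add(baseline_id)

    target_topics = baselines[target_id]
    if not target_topics:
        return 0.0

    # A topic counts as reused if at least one other baseline contains it.
    reused = [t for t in target_topics if len(index[t]) > 1]
    return 100.0 * len(reused) / len(target_topics)


if __name__ == "__main__":
    sample = {
        "A": {"t1", "t2", "t3", "t4"},
        "B": {"t2", "t3"},
        "C": {"t5"},
    }
    print(reuse_percentage(sample, "A"))
```

    Whether "appears in at least one other baseline" is the right definition of reuse is a separate decision; the index makes it cheap to try other definitions as well.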

    A little while later it is time to update the reports. Baselines that are released will no longer change, so those never have to be downloaded again. So we can cut that chunk out right away - we simply keep what we already have. For all other baselines, download just the baseline metadata. Chances are only a small subset of active baselines have changed since the last iteration, and those are the only ones you need to download again. The metadata is just a few lines of XML per baseline, so that should not take long.

    You'd have to set up a simple tracking mechanism, of course. But all of this could be done using just a local folder on your computer where you download the stuff and put your tracker file (JSON or whatever).
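
    A minimal sketch of such a tracker, under a few assumptions: the tracker is a JSON file mapping each baseline id to the two metadata values mentioned above, stored here under the hypothetical keys `modified_on` (from MODIFIED-ON, treated as an opaque string) and `released` (derived from FISHLABELRELEASED). Fetching the metadata from the repository is not shown; `current_metadata` stands in for whatever that call returns.

```python
import json
from pathlib import Path


def load_tracker(path):
    """Read the tracker file; an absent file means nothing was downloaded yet."""
    path = Path(path)
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def save_tracker(path, tracker):
    """Write the tracker back to disk as JSON."""
    Path(path).write_text(json.dumps(tracker, indent=2), encoding="utf-8")


def baselines_to_download(tracker, current_metadata):
    """Return the ids of baselines whose report must be (re)downloaded.

    Both arguments are dicts of baseline id ->
    {"modified_on": str, "released": bool} (assumed shape).
    """
    todo = []
    for baseline_id, meta in current_metadata.items():
        known = tracker.get(baseline_id)
        if known is None:
            # Never seen before: download it.
            todo.append(baseline_id)
        elif known["released"]:
            # Released baselines no longer change: keep the local copy.
            continue
        elif known["modified_on"] != meta["modified_on"]:
            # Active baseline that changed since the last run.
            todo.append(baseline_id)
    return todo
```

    After downloading the listed reports, the entries in the tracker would be overwritten with the fresh metadata and saved, so the next run starts from the new state.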

    PS: Then there is the question of how you define reuse - I mean how you turn this information into an easy-to-understand format for, say, C-level managers who are not familiar with modular documentation. There are "lies, damned lies, and statistics", and this is certainly true for reuse reports, too. Oh, a topic for another time...

    HTH

    Joakim

  • Yes, this is a really good idea; it will save API response time. Thank you for the suggestion. I will have to create a script to download all baselines.