Need help extracting a large number (8-25k) of files from older SDL (11.1)

Hi all from an SDL novice - my group needs to extract a large number of files (up to 25k) out of SDL (an older version, I believe 11.1?) so that they can be placed on an FTP server. We have a tight timeline and budget and don't know of a script or a quick way to extract more than a small number at a time (maybe 20 files?) manually.

Does anyone have any experience with this or know of a script that can be used? Any help or input is appreciated. Thanks

  • Hey Susan,

    the report is available in the Web Client > publications folder > select a publication version > click "Reports". You then get a dialog where you can specify a resolution for the images and a language. Click the "Show Report" button and a new window will appear with all objects listed. At the top of the screen, you can then click "Export Publication". This starts a background process that exports all objects in one go onto the file system on the server.

    There are some things to consider:
    - if you need multiple resolutions, you should run the report once for each resolution (XML files will be exported as well, but you can ignore these)
    - you need to repeat this per language
    - only the versions specified in the baseline get exported
    - don't make the dummy publication too large, as loading the report in the browser may take a while
    - by creating a couple of dummy publications you will be able to 'organize' the objects you export in a somewhat meaningful manner

    The above does not require a script as it uses existing functionality and some manual work to create and export the publications.

    The alternative is to create a custom script using the KC APIs. But in that case you also need a way to identify the objects you need.
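    To give a feel for what such a custom script involves, here is a minimal Python sketch of the extract loop. This is not a working KC client: `list_objects` and `get_content` are hypothetical stand-ins for the real Knowledge Center web-service calls, which return object identifiers and their content. Only the overall shape (identify objects, fetch each one, write it to a folder you can then push to FTP) reflects the approach described above.

```python
import os
import tempfile

# Hypothetical stand-ins for the KC API calls; a real script would invoke
# the Knowledge Center web services here to enumerate and fetch objects.
def list_objects():
    """Return (guid, filename) pairs for the objects to extract."""
    return [("GUID-0001", "topic-one.xml"), ("GUID-0002", "topic-two.xml")]

def get_content(guid):
    """Return the raw content of one object as bytes."""
    return ("<topic id=%r><title>Stub</title></topic>" % guid).encode("utf-8")

def extract_all(out_dir):
    """Download every identified object into out_dir, one file per object."""
    os.makedirs(out_dir, exist_ok=True)
    written = []
    for guid, filename in list_objects():
        target = os.path.join(out_dir, filename)
        with open(target, "wb") as fh:
            fh.write(get_content(guid))
        written.append(target)
    return written

out = os.path.join(tempfile.gettempdir(), "kc_export")
files = extract_all(out)
print("exported %d files" % len(files))
```

    Once the loop works for a handful of objects, scaling to 25k is just a matter of letting it run; the hard part is the identification step, as noted above.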

    Hope this helps.

    Best regards,
    Kurt
  • Thanks so much for your details and time, Kurt. I am going to share this with our IA and see what we can do. My initial goal was to find a script (or a script how-to) for this giant extraction, but maybe this method avoids the need for one. If it's OK with you, I will circle back later, either to confirm that this works or to see what other info I can get from your team. Thx again
  • Susan,

    Couple things to consider/balance.
    Creating a publication or series of publications that contains 25,000 or more objects could take considerable time.
    If all the objects are already linked into maps, then you can fairly easily link all the maps into a new master map for a publication.
    If the objects are not linked into maps, then you may spend more time getting everything set up in publications/maps than you would creating a script with the API.
    The benefit of a script written against the API is greater flexibility: you can adjust what content you extract, and the script is reusable for other sets of content.
    If you go the publication route and later want to grab more content, you have to update the pubs by hand each time, versus simply changing the script's configuration and re-running it.
    If you could explain how you are going to identify the content that needs to be extracted, we can better point you in one direction or another.
    For example:
    - Are you grabbing all the content from a root folder and all its subfolders recursively?
    - Is it all content that has certain metadata set?
    - Is it all content created before or after a certain date?
    - Or any combination of the above?
    These are all scenarios that could be scripted with the API.
    If you are going to hand-pick the objects, then there is really no good way to use the API, and the publication route is your only option.
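    The selection scenarios listed above can each be expressed as a simple predicate over the object records the API returns. The sketch below is illustrative only: the record shape (`path`, `metadata`, `created`) is an assumption, not the real KC data model, but it shows how the folder, metadata, and date criteria combine.

```python
from datetime import date

# Sample object records; a real script would build these from API results.
objects = [
    {"path": "/Root/Manuals/Install/intro.xml",
     "metadata": {"product": "X"}, "created": date(2015, 3, 1)},
    {"path": "/Root/Manuals/Admin/setup.xml",
     "metadata": {"product": "Y"}, "created": date(2017, 6, 12)},
    {"path": "/Root/Archive/old.xml",
     "metadata": {"product": "X"}, "created": date(2012, 1, 5)},
]

def in_folder(obj, root):
    """Scenario 1: object lives under a root folder (recursively)."""
    return obj["path"].startswith(root.rstrip("/") + "/")

def has_metadata(obj, key, value):
    """Scenario 2: object carries a specific metadata value."""
    return obj["metadata"].get(key) == value

def created_after(obj, cutoff):
    """Scenario 3: object was created after a certain date."""
    return obj["created"] > cutoff

# Scenario 4: any combination, here folder AND metadata.
selected = [o for o in objects
            if in_folder(o, "/Root/Manuals") and has_metadata(o, "product", "X")]
print([o["path"] for o in selected])
```

    Swapping the combined condition is a one-line config change, which is exactly the reusability advantage over rebuilding publications by hand.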
    Hope this helps.