Updating object metadata using Content Importer

We want to add or change the metadata of all objects in a publication at once. I wrote a PowerShell script, but it takes a long time to complete because it runs as a single process. So I wondered whether it is possible to add or change metadata using Content Importer instead, that is, to import only .met or .3sish files with Content Importer in order to update the metadata of a batch of objects. Is this possible?

P.S. We are using the initial release of Tridion Docs 14.

Regards,

Naoki

  • Hi Naoki - Under the covers, ISHRemote and Content Importer use exactly the same public API, so if you see a performance difference, it is probably worth verifying your code against the expected end result. Any chance of sharing the PowerShell script?

  • Hi Dave,

    I wrote two PowerShell scripts: one for getting metadata and one for setting metadata. I would be grateful for any advice on improving the performance of these scripts.

    # Get objects metadata from TD14 UAT server
    # [Command syntax]
    # > GetMetaFromUAT.ps1 <obj_list_csv> <out_meta_tsv> <lang>
    #
    # !!! Important coding note !!!
    # The <out_meta_tsv> argument must be an absolute path.
    # System.IO.StreamWriter resolves relative paths against the process working directory,
    # not the current PowerShell location, so the file location must be given as a full path.
    #

    # Get command line arguments and check them
    Param($obj_list_csv, $out_meta_tsv, $lang)
    if ([string]::IsNullOrEmpty($obj_list_csv) -or [string]::IsNullOrEmpty($out_meta_tsv) -or [string]::IsNullOrEmpty($lang)) {
        Write-Error "Error: Three arguments (obj_list_csv, out_meta_tsv, lang) must be specified."
        exit 1   # 'break' outside a loop can also stop the caller; exit ends just this script
    }

    #Provide Web Services URL
    $url = 'https://xxxxxxxxxx/ISHWS/'

    #Provide Domain and your username
    $username = 'xxxxx'
    $password = 'xxxxx'

    # Check for file existence
    if (-not(Test-Path $obj_list_csv)) {
        Write-Error "Error: Specified CSV file does not exist."
        exit 1
    }

    # Log in to Content Manager
    try {
        $ishSession = New-IshSession -WsBaseUrl $url -IshUserName $username -IshPassword $password
    } catch {
        Write-Host "Error: Cannot connect to TridionDocs."
        exit 1
    }

    # Read CSV file that contains GUID and its version list
    $input_csv = Get-Content $obj_list_csv | ConvertFrom-Csv -Header @('GUID', 'version')

    # Create output file and write header line in it
    try {
        $outputFile = New-Object System.IO.StreamWriter($out_meta_tsv, $false, [Text.Encoding]::GetEncoding("UTF-16"))
        $outputFile.WriteLine("GUID`tVersion`tTitle`tModel classification`tModel series`tModel`tRegion`tCountry`tKeywords`tResolution")
    } catch {
        Write-Error "Error: Cannot create the output file."
        exit 1   # without a writer there is nothing useful the rest of the script can do
    }

    # Get metadata for each object
    try {
        $input_csv | ForEach-Object {
            $cur_GUID = $_."GUID"
            $cur_version = $_."version"
            # One server call per row: filter on version, language and resolution
            $metadataFilter = Set-IshMetadataFilterField -IshSession $ishSession -Level version -Name "VERSION" -FilterOperator Equal -Value $cur_version |
                Set-IshMetadataFilterField -IshSession $ishSession -Level lng -Name "DOC-LANGUAGE" -FilterOperator Equal -Value $lang |
                Set-IshMetadataFilterField -IshSession $ishSession -Level lng -Name "FRESOLUTION" -FilterOperator Equal -Value "Thumbnail"
            $target_obj = Get-IshDocumentObj -IshSession $ishSession -LogicalId $cur_GUID -MetadataFilter $metadataFilter
            $title = ($target_obj.IshField | where Name -eq FTITLE | where ValueType -eq value).value
            $modelClassification = ($target_obj.IshField | where Name -eq KUBOTAMODELCLASSIFICATION | where ValueType -eq value).value
            $modelSeries = ($target_obj.IshField | where Name -eq FKUBOTAMODELSERIES | where ValueType -eq value).value
            $model = ($target_obj.IshField | where Name -eq FKUBOTAMODEL | where ValueType -eq value).value
            $region = ($target_obj.IshField | where Name -eq FKUBOTAREGION | where ValueType -eq value).value
            $country = ($target_obj.IshField | where Name -eq FKUBOTACOUNTRY | where ValueType -eq value).value
            $keyword = ($target_obj.IshField | where Name -eq FKUBOTAKEYWORD | where ValueType -eq value).value
            $resolution = ($target_obj.IshField | where Name -eq FRESOLUTION | where ValueType -eq value).value
            Write-Host $cur_GUID,"`t",$cur_version,"`t",$title,"`t",$modelClassification,"`t",$modelSeries,"`t",$model,"`t",$region,"`t",$country,"`t",$keyword,"`t",$resolution
            $outputFile.WriteLine("$cur_GUID`t$cur_version`t$title`t$modelClassification`t$modelSeries`t$model`t$region`t$country`t$keyword`t$resolution")
        }
    } catch {
        Write-Error "Error: Cannot retrieve metadata or write it to the output file."
    } finally {
        # Always release the file handle, even if a lookup or write failed
        $outputFile.Close()
    }

    ==========

    # Set objects metadata to TD14 UAT server
    # [Command syntax]
    # > SetMetaToUAT.ps1 <metadata_tsv> <lang>
    #

    # Get command line arguments and check them
    Param($metadata_tsv, $lang)
    if ([string]::IsNullOrEmpty($metadata_tsv) -or [string]::IsNullOrEmpty($lang)) {
        Write-Error "Error: Two arguments (metadata_tsv, lang) must be specified."
        exit 1   # 'break' outside a loop can also stop the caller; exit ends just this script
    }
    $metadata_tsv = Resolve-Path($metadata_tsv) # Convert to full path

    #Provide Web Services URL
    $url = 'https://xxxxxxxxxx/ISHWS/'

    #Provide Domain and your username
    $username = 'xxxxx'
    $password = 'xxxxx'

    # Check for file existence
    if (-not(Test-Path $metadata_tsv)) {
        Write-Error "Error: Specified metadata file does not exist."
        exit 1
    }

    # Log in to Content Manager
    try {
        $ishSession = New-IshSession -WsBaseUrl $url -IshUserName $username -IshPassword $password
    } catch {
        Write-Error "Error: Cannot connect to TridionDocs."
        exit 1
    }

    # Read TSV file that contains GUID and its metadata
    $header = "guid`tversion`ttitle`tclassification`tmodelseries`tmodel`tregion`tcountry`tkeywords`tresolution" -split "`t"
    $input_tsv = Get-Content -Encoding Unicode $metadata_tsv | ConvertFrom-Csv -Header $header -Delimiter "`t"

    # Set metadata for each object
    try {
        $input_tsv | ForEach-Object {
            $GUID = $_."guid"
            # Skips the header line of the TSV file and any row without a GUID
            if ($GUID -match "^GUID-.*") {
                # Property names must match the entries in $header above;
                # "Model classification" and "Model series" do not exist on the row and would yield $null
                $version = $_."version"
                $title = $_."title"
                $modelClassification = $_."classification"
                $modelSeries = $_."modelseries"
                $model = $_."model"
                $region = $_."region"
                $country = $_."country"
                $keyword = $_."keywords"
                $resolution = $_."resolution"
                Write-Host $GUID,"`t",$version,"`t",$title

                $metadataFilter = Set-IshMetadataFilterField -IshSession $ishSession -Level version -Name "VERSION" -FilterOperator Equal -Value $version |
                    Set-IshMetadataFilterField -IshSession $ishSession -Level lng -Name "DOC-LANGUAGE" -FilterOperator Equal -Value $lang |
                    Set-IshMetadataFilterField -IshSession $ishSession -Level lng -Name "FRESOLUTION" -FilterOperator Equal -Value "Thumbnail"
                $target_obj = Get-IshDocumentObj -IshSession $ishSession -LogicalId $GUID -MetadataFilter $metadataFilter

                if ($resolution.Length -eq 0) {
                    # This is NOT an image object, so it has no resolution
                    Write-Host "... Not image object"
                    $ishObjectMetadata = Set-IshMetadataField -Name "FTITLE" -Level logical -ValueType Value -Value $title |
                        Set-IshMetadataField -Name "KUBOTAMODELCLASSIFICATION" -Level version -ValueType Value -Value $modelClassification |
                        Set-IshMetadataField -Name "FKUBOTAMODELSERIES" -Level version -ValueType Value -Value $modelSeries |
                        Set-IshMetadataField -Name "FKUBOTAMODEL" -Level version -ValueType Value -Value $model |
                        Set-IshMetadataField -Name "FKUBOTAREGION" -Level version -ValueType Value -Value $region |
                        Set-IshMetadataField -Name "FKUBOTACOUNTRY" -Level version -ValueType Value -Value $country |
                        Set-IshMetadataField -Name "FKUBOTAKEYWORD" -Level version -ValueType Value -Value $keyword
                } else {
                    # This is an image object, so it has a resolution
                    Write-Host "... Image object"
                    $ishObjectMetadata = Set-IshMetadataField -Name "FTITLE" -Level logical -ValueType Value -Value $title |
                        Set-IshMetadataField -Name "KUBOTAMODELCLASSIFICATION" -Level logical -ValueType Value -Value $modelClassification |
                        Set-IshMetadataField -Name "FKUBOTAMODELSERIES" -Level logical -ValueType Value -Value $modelSeries |
                        Set-IshMetadataField -Name "FKUBOTAMODEL" -Level logical -ValueType Value -Value $model |
                        Set-IshMetadataField -Name "FKUBOTAREGION" -Level logical -ValueType Value -Value $region |
                        Set-IshMetadataField -Name "FKUBOTACOUNTRY" -Level logical -ValueType Value -Value $country |
                        Set-IshMetadataField -Name "FKUBOTAKEYWORD" -Level logical -ValueType Value -Value $keyword
                }
                $target_obj = Set-IshDocumentObj -IshSession $ishSession -IshObject $target_obj -Metadata $ishObjectMetadata
            }
        }
    } catch {
        Write-Error "Error: Cannot modify metadata of objects."
    }

  • Thanks Naoki, I'm deriving the "intention" from your script. It looks like you get a list of content-object-versions from somewhere (CSV), you read their metadata on one system, and then you use that metadata to update exactly the same content-object-versions on some other system.

    Allow me to rewrite that in pseudocode:

    1. Reading metadata, for every object in $input_csv:
      1. Do a single-object Get-IshDocumentObj, filtering on a specific version and a specific language
      2. Write a single line to the output file
    2. Writing metadata, for every object in $input_tsv:
      1. Do a single-object Set-IshDocumentObj, again filtering on a specific version, as your (not-anonymized) fields are on logical and version level

    So on the write operation (step 2) there is little you can do, as every update in the CMS needs to become one database transaction. I also don't think a batch update would fit even if you could do one: you build $ishObjectMetadata inside the ForEach-Object loop, so once for every CSV line, and nothing guarantees that each object gets the same metadata. Outside of ISHRemote you could consider multi-threading, but I wonder whether that development effort is worth it (probably not in PowerShell).

    On the read operation (step 1), you have more options for getting to a much faster group retrieval:

    • Call Get-IshDocumentObj for multiple GUIDs and filter client-side, so retrieve all versions instead of one specific version
    • Depending on how the CSV file is sourced, you could retrieve the object list differently
    • Do you need the CSV at all? ISHRemote allows an explicit -IshSession parameter, so you can have two sessions active, $ishSessionPrimary and $ishSessionSecondary
    • Reading is often faster than writing, so skipping a write operation could make it faster if you know the object didn't change (based on the modified-on date or some custom field)
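
    A rough sketch of the first option, under assumptions: Select-RequestedVersion is a hypothetical helper (not part of ISHRemote), and the property names IshRef and version_version_value are what I would expect on objects returned by a recent ISHRemote, so please verify them on your installation. The stand-in objects below only exist so the filtering logic can be run without a server; the real retrieval call is shown as a comment.

```powershell
# Hypothetical helper: from a batch of retrieved objects (all versions),
# keep only the (GUID, version) pairs listed in the CSV, in CSV order.
function Select-RequestedVersion {
    param(
        [Parameter(Mandatory)][object[]]$Object,             # objects retrieved in one call
        [Parameter(Mandatory)][object[]]$Requested,          # CSV rows with GUID and version
        [string]$IdProperty = 'IshRef',                      # assumption: logical id property
        [string]$VersionProperty = 'version_version_value'   # assumption: version property
    )
    # Index the retrieved objects by "GUID|version" (hashtable keys are case-insensitive)
    $byKey = @{}
    foreach ($o in $Object) {
        $byKey["$($o.$IdProperty)|$($o.$VersionProperty)"] = $o
    }
    # Walk the CSV rows so the output keeps the order of the CSV file
    foreach ($row in $Requested) {
        $key = "$($row.GUID)|$($row.version)"
        if ($byKey.ContainsKey($key)) { $byKey[$key] }
    }
}

# Against a real server, the single retrieval could look like this (untested here):
#   $languageFilter = Set-IshMetadataFilterField -IshSession $ishSession -Level lng -Name "DOC-LANGUAGE" -FilterOperator Equal -Value $lang
#   $all = Get-IshDocumentObj -IshSession $ishSession -LogicalId $input_csv.GUID -MetadataFilter $languageFilter

# Stand-in data (made up) to exercise the filter locally
$requested = @(
    [pscustomobject]@{ GUID = 'GUID-A'; version = '2' }
    [pscustomobject]@{ GUID = 'GUID-B'; version = '1' }
)
$retrieved = @(
    [pscustomobject]@{ IshRef = 'GUID-B'; version_version_value = '1' }
    [pscustomobject]@{ IshRef = 'GUID-A'; version_version_value = '1' }
    [pscustomobject]@{ IshRef = 'GUID-A'; version_version_value = '2' }
)
$hits = @(Select-RequestedVersion -Object $retrieved -Requested $requested)
$hits | ForEach-Object { Write-Host $_.IshRef $_.version_version_value }
```

    Keeping the language (and, for images, the resolution) filter on the server side matters here: the helper keys on GUID and version only, so two languages or resolutions of the same version would overwrite each other in the index.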

    If I can give some pointers:

    1. CSV handling is okay but can be error-prone. For nicer CSV and Excel file handling, I like the PowerShell module "ImportExcel"; have a look at https://github.com/dfinke/ImportExcel
    2. Furthermore, I assume you have the latest ISHRemote installed (v0.12 at the time of writing), where some of your code can be a bit shorter; see section "General" on https://github.com/sdl/ISHRemote/releases/tag/v0.7 ... instead of writing $title = ($target_obj.IshField | where Name -eq FTITLE | where ValueType -eq value).value ... you should write $title = Get-IshMetadataField -Name FTITLE -Level Logical -ValueType Value ... or you can also use the read-only PSNoteProperties $title = $target_obj.ftitle_logical_value
    3. The $resolution.Length check is expressed more clearly as an if statement on the object type, for example $target_obj.IshType -eq 'ISHIllustration'
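
    To illustrate pointer 3, here is a minimal sketch of how the two nearly identical Set-IshMetadataField pipelines in your set script could collapse into one description of the fields, with the level chosen from the object type. Get-FieldSpec and $fieldMap are hypothetical names of mine; the field names, column names and the logical/version rule are copied from your script, and the row values are made up. It only builds plain objects, so it runs without ISHRemote; each resulting Name/Level/Value would still be passed to Set-IshMetadataField as you do today.

```powershell
# Metadata field name -> TSV column name (taken from the scripts in this thread)
$fieldMap = [ordered]@{
    KUBOTAMODELCLASSIFICATION = 'classification'
    FKUBOTAMODELSERIES        = 'modelseries'
    FKUBOTAMODEL              = 'model'
    FKUBOTAREGION             = 'region'
    FKUBOTACOUNTRY            = 'country'
    FKUBOTAKEYWORD            = 'keywords'
}

# Hypothetical helper: describe the fields to set for one TSV row.
function Get-FieldSpec {
    param(
        [Parameter(Mandatory)]$Row,                                       # one TSV row
        [Parameter(Mandatory)][string]$IshType,                           # e.g. $target_obj.IshType
        [Parameter(Mandatory)][System.Collections.IDictionary]$FieldMap
    )
    # Same rule as the original script: images carry the custom fields on
    # logical level, everything else carries them on version level
    $customLevel = if ($IshType -eq 'ISHIllustration') { 'logical' } else { 'version' }

    # The title is always set on logical level
    [pscustomobject]@{ Name = 'FTITLE'; Level = 'logical'; Value = $Row.title }
    foreach ($name in $FieldMap.Keys) {
        [pscustomobject]@{ Name = $name; Level = $customLevel; Value = $Row.($FieldMap[$name]) }
    }
}

# Made-up row to exercise the helper locally
$row = [pscustomobject]@{
    title = 'Sample title'; classification = 'c1'; modelseries = 's1'
    model = 'm1'; region = 'r1'; country = 'x1'; keywords = 'k1'
}
$specs = @(Get-FieldSpec -Row $row -IshType 'ISHModule' -FieldMap $fieldMap)
$specs | Format-Table Name, Level, Value
```

    The gain is mostly maintainability: adding a field becomes a one-line change in $fieldMap, and the image versus non-image decision lives in one place.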

    Best wishes,
    Dave

    PS: I'm pressing Reply now, as I hope you appreciate this rough early feedback more than late, elaborate feedback... :-)

  • Hi Dave,

    Thank you for your prompt reply. The objective of my PowerShell scripts is to add or modify metadata efficiently for the hundreds of objects included in a publication. We configured our TridionDocs to hold product-oriented information such as model name, market region, country, keywords, etc. In general, many objects in a publication have the same metadata. Therefore, we can speed up the metadata editing task by exporting the metadata, editing it in Excel, and importing it back into TridionDocs.

    The CSV file used by the get-metadata script is generated by our own DITA-OT plugin and tool. The DITA-OT plugin generates an HTML file that lists the objects in a publication according to a bookmap. The HTML file includes the GUID and version of each object. The tool extracts the GUID and version pairs from the HTML file and generates a CSV file. The order of the objects in the list is important, because we often modify DB object names depending on the chapter in which the objects are located.

    I understand there are some points that can be improved. I'll try your suggestions.

    Thanks a lot, Dave.

    Kind regards,

    Naoki