Search.PerformSearch in Knowledge Center 12

I am using the API documentation for Tridion Docs 13, but I am running Knowledge Center 12. For Search.PerformSearch, it says that results are no longer cached and that the caller has to manage retrieving results in chunks. I'm guessing that what I am seeing is the old behavior.

If I request 0 hits, it returns the number of hits, without the results. But, if I request 1000, it returns the first 1000, over and over, rather than retrieving the next chunk.

If I request all of the hits, it returns them all in that call, e.g. PerformSearch(query, 3214, out result), returns 3214 hits.

How can I retrieve results in chunks? Or, can I? 

 

  • Hi Kendall,

    When the documentation refers to retrieving results in chunks, it actually means that you have to use one of the following flows (although neither is really chunked).

    Either request all hits and then retrieve the metadata in blocks of 1000. With this approach you can cache the results so you do not have to re-run the query.

    1/ Request all hits for your search query.
    2/ Retrieve metadata for the items 1 to 1000
    3/ Retrieve metadata for the items 1001 to 2000
    ....

    Or, for each new chunk that you need, re-run your query with the number of hits increased.

    1/ Request 1000 hits for your search query.
    2/ Retrieve the metadata for the items 1 to 1000
    3/ Request 2000 hits for your search query.
    4/ Retrieve the metadata only for the items from 1001 to 2000.
    etc....

    As said, it is not really chunked, but you can at least limit the number of objects for which the metadata needs to be returned.

    Kind Regards,

    Raf
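The first flow above (request all hits once, then fetch metadata block by block) can be sketched as follows. This is only an illustration in Python, not the Tridion Docs API: `fetch_metadata_in_blocks`, `blocks`, and the `get_metadata` callback are hypothetical names, and the hit list is faked so the sketch runs on its own.

```python
from typing import Callable, Iterator, List, Sequence


def blocks(items: Sequence[str], size: int = 1000) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_metadata_in_blocks(
    hit_ids: Sequence[str],
    get_metadata: Callable[[Sequence[str]], List[dict]],
    size: int = 1000,
) -> List[dict]:
    """Call `get_metadata` once per block instead of once for every hit."""
    results: List[dict] = []
    for block in blocks(hit_ids, size):
        results.extend(get_metadata(block))
    return results


# Stand-in data: 3214 hit identifiers, as in the question (identifiers are made up).
all_hits = [f"GUID-{i}" for i in range(1, 3215)]
calls: List[int] = []


def fake_get_metadata(ids: Sequence[str]) -> List[dict]:
    """Hypothetical metadata call; records how many ids each request carried."""
    calls.append(len(ids))
    return [{"id": i} for i in ids]


metadata = fetch_metadata_in_blocks(all_hits, fake_get_metadata)
print(calls)          # block sizes requested
print(len(metadata))  # total metadata records collected
```

Because the full hit list is kept in memory (`all_hits`), the query itself does not have to be re-run between blocks.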

  • How do you retrieve items 1001 and up? What I see is this:

    PerformSearch(query, 0, out result) -> returns 3124
    PerformSearch(query, 1000, out result) -> result contains metadata for 1 - 1000
    PerformSearch(query, 1000, out result) -> result contains metadata for 1 - 1000, again, not 1001 - 2000

    Or:

    PerformSearch(query, 1000, out result) -> result contains metadata for 1 - 1000
    PerformSearch(query, 2000, out result) -> result contains metadata 1 - 2000

    Or:

    PerformSearch(query, 999999999, out result) -> result contains metadata for 1 - 3124

    Or:

    var n = PerformSearch(query, 0, out result) -> n is 3124
    PerformSearch(query, n, out result) -> result contains metadata for 1 - 3124

    So, I guess I should always request all the results?
  • As Raf described, there's no way -- using the API alone -- to chunk the results (though you can chunk the subsequent requests for metadata). If you expect the result set to be limited (like the 3,214 in your example), I would just get the results for everything. However, I have had use cases where I expect large results from a search (in the hundreds of thousands). For these use cases, I use a query based on the CREATED-ON or MODIFIED-ON date-times, and run my query over a given set of date ranges (for example, one month at a time). This gives a kind of enforced chunking. My code makes iterative calls to the search function to collect the results for each date-time span.
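The date-window approach described above might look roughly like this. It is a sketch under assumptions: `month_windows`, `collect_by_window`, and the `search_window` callback are hypothetical, the real query syntax for CREATED-ON or MODIFIED-ON is not shown in this thread, and the sample items are invented. Windows are half-open (from inclusive, to exclusive) so an item on a boundary is not collected twice.

```python
from datetime import date
from typing import Callable, Iterator, List, Tuple


def month_windows(start: date, end: date) -> Iterator[Tuple[date, date]]:
    """Yield (from, to) pairs covering [start, end), one calendar month at a time."""
    current = start
    while current < end:
        if current.month == 12:
            nxt = date(current.year + 1, 1, 1)
        else:
            nxt = date(current.year, current.month + 1, 1)
        nxt = min(nxt, end)  # clip the last window to the requested end
        yield current, nxt
        current = nxt


def collect_by_window(
    search_window: Callable[[date, date], List[str]],
    start: date,
    end: date,
) -> List[str]:
    """Run one search per window and concatenate the results."""
    results: List[str] = []
    for lo, hi in month_windows(start, end):
        results.extend(search_window(lo, hi))
    return results


# Invented sample data: item id -> created-on date.
ITEMS = {
    "A": date(2023, 11, 20),
    "B": date(2023, 12, 1),
    "C": date(2024, 1, 31),
    "D": date(2024, 2, 9),
    "E": date(2024, 3, 1),  # outside the requested range
}


def fake_search_window(lo: date, hi: date) -> List[str]:
    """Hypothetical stand-in for a search restricted to lo <= created-on < hi."""
    return [k for k, d in ITEMS.items() if lo <= d < hi]


windows = list(month_windows(date(2023, 11, 15), date(2024, 2, 10)))
found = collect_by_window(fake_search_window, date(2023, 11, 15), date(2024, 2, 10))
print(len(windows), found)
```

If a single month can still return too many hits, the same loop works with shorter windows; the window size is a guess that depends on how the content is distributed over time.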
  • Yes, I'm worried about large result sets. The documentation for PerformSearch says "The tool which is executing this function is responsible for caching the query and for example requesting more hits if the previous maximum number of hits has been met."

    How does one request more hits, as it says?

    Or, is it not possible? Then something like guessing date ranges may or may not limit results to an appropriate size, whatever that is.
  • I'm still hoping to find an interpretation of "requesting more hits", and to learn whether it can be done from a single query string. But I'm taking the responses to mean that there is no alternative to either losing data from a query or creating result sets that are too large.

    So, for now, I've written a crawler to gather (very slowly) the data that my query was meant to return.
  • Hi, Kendall. Not sure if the question was directed to me or to Raf. I don't know what "requesting more hits" means. I've built my searches the same way as you -- basically a crawler, in my case based on MODIFIED-ON or CREATED-ON values. I didn't see any other way to do it.
  • Hi Kendall,

    Unfortunately, the statement ‘Requesting more hits’ means that you re-run your existing query with the maximum number of hits to return set to a higher number.

    This returns the higher number of hits, and you only need to display the hits from your previous threshold onwards. For the samples provided above, it would look like this:


    PerformSearch(query, 1000, out result) -> result contains metadata for 1 – 1000
    Display all results in your list.

    PerformSearch(query, 2000, out result) -> result contains metadata for 1 – 2000
    Display ONLY the results from entry 1001 to 2000.

    There is no other way to mimic the pagination/requesting more hits at this point in time.
    You might want to add this feature request (better support for paginated search results) to the SDL Tridion Docs Idea forum (community.sdl.com/.../).

    Kind Regards,

    Raf
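The "request more hits" flow from this last reply can be sketched as a loop that raises the maximum on each call and keeps only the hits past the previous threshold. This is a Python illustration, not the actual API: `search_incrementally` and `fake_perform_search` are hypothetical, and the sketch assumes the service returns hits in the same order on every run, which this thread does not confirm.

```python
from typing import Callable, Iterator, List

# Invented index of 3124 hits, matching the count reported earlier in the thread.
INDEX = [f"hit-{i}" for i in range(1, 3125)]
requested: List[int] = []


def fake_perform_search(query: str, max_hits: int) -> List[str]:
    """Hypothetical stand-in: always returns the first `max_hits` results."""
    requested.append(max_hits)
    return INDEX[:max_hits]


def search_incrementally(
    perform_search: Callable[[str, int], List[str]],
    query: str,
    step: int = 1000,
) -> Iterator[List[str]]:
    """Re-run the query with a growing maximum, yielding only the new hits."""
    seen = 0
    max_hits = step
    while True:
        hits = perform_search(query, max_hits)
        new_hits = hits[seen:]  # skip everything already handled
        if new_hits:
            yield new_hits
        seen = len(hits)
        if len(hits) < max_hits:  # fewer hits than requested: nothing left
            break
        max_hits += step


chunk_sizes = [len(chunk) for chunk in search_incrementally(fake_perform_search, "some query")]
print(chunk_sizes)
print(requested)
```

Note that each call still returns every hit from the start, so the total work grows with each step; the loop only limits how many results the caller has to process at once.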