A guide to understanding Tridion Docs’ new Reuse Metrics

The primary benefit of Tridion Docs is efficiency and accuracy through content reuse, which translates into better quality and less time spent creating documentation. If you reuse from a single source, you only have to keep that source correct.

As Tridion Docs users, we know that we are benefiting from content reuse, but how do we prove it? Documentation managers need numbers to show that the investment is paying off, so that they can make the right decision about continuing and expanding it.

Also, what if we want to reuse more? We may have some ideas for how to do it, but how do we measure if they’re working?

As your team becomes expert users of componentized content, other teams may also be interested. How do you show their management the benefits?

The product does not have an out-of-the-box feature that covers the business needs mentioned above and shows the reuse level and its ROI (Return on Investment) value.

The Tridion Docs community and many customers also see the need and have been asking for a reporting feature that focuses on reuse metrics.

The Tridion Docs Product Management team invested effort in analyzing various sources of information, such as community materials about technical documentation, customers’ needs, and internal domain knowledge.

We came up with a reuse formula that reflects what most customers would intuitively understand as reuse. This article explains how to understand it.

What is reuse? Where to start?

For our purposes, we count the reuse of topics. A topic is used when it appears in a publication. Each additional publication that the topic appears in counts as a reuse.

Let's say that you create topic GUID-A for publication GUID-Z. Then you create publication GUID-Y, which also contains topic GUID-A. That is a reuse. In fact, at that point publication GUID-Y contains only reused topics. If we now create a new topic, GUID-B, and add it to publication GUID-Y, then publication GUID-Y has one reused topic and one new one (50% reuse). If we look at both publications together, there are two "new" uses of topics and one reuse.

Imagine that later you create a new publication GUID-X that contains topic GUID-A, topic GUID-B, and a new topic GUID-C. You also create publication GUID-W, which contains topic GUID-B, topic GUID-C, and a new topic GUID-D.

Let’s calculate our reuse level for the current state of the system.

We can create an ordered list of the facts of topics being added to publications, which will help us in the calculation.

Now that we have an ordered list, we can extend the table by adding a status column. A topic reference receives the status “new” the first time the topic appears in the list and “reused” for every later appearance. We can also build a summary table with the number of all facts of topic use (all topic references), the facts of first use (new topic references), and the facts of reuse (reused topic references).
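Written as a formula, the relationship between these three counts and the reuse degree can be sketched as follows. The symbols are not from the product; they are introduced here only for illustration.

```latex
\[
R = \frac{N_{\text{reused}}}{N_{\text{all}}} \times 100\%,
\qquad
N_{\text{all}} = N_{\text{new}} + N_{\text{reused}}
\]
```

Here $N_{\text{all}}$ is the number of all topic references, $N_{\text{new}}$ the number of new topic references, and $N_{\text{reused}}$ the number of reused topic references.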

Now with this information, we can calculate the reuse degree for the current state of the system. Of the 9 topic references, 4 are new and 5 are reused, so the reuse degree is (5/9) * 100% ≈ 56%.

It is worth mentioning that timeline support is essential for analyzing data over time. Knowing only the current level of content reuse is not enough. To gain insight, you need to see the value over time and compare it with the actual activity in the system. You can then decide whether the reuse level for the product release in the last quarter was good enough, or whether you need to push for an increase in the upcoming release.
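The counting in this example can be reproduced with a short Python sketch. It only illustrates the principle and is not the product's implementation: the names `EVENTS`, `classify`, and `reuse_degree` are hypothetical, and the order of topics inside each publication is an assumption.

```python
# Ordered facts of "topic added to publication", taken from the example.
# Assumption: the order of topics inside each publication.
EVENTS = [
    ("GUID-Z", "GUID-A"),
    ("GUID-Y", "GUID-A"),
    ("GUID-Y", "GUID-B"),
    ("GUID-X", "GUID-A"),
    ("GUID-X", "GUID-B"),
    ("GUID-X", "GUID-C"),
    ("GUID-W", "GUID-B"),
    ("GUID-W", "GUID-C"),
    ("GUID-W", "GUID-D"),
]


def classify(events):
    """Mark a reference 'new' on a topic's first appearance, else 'reused'."""
    seen_topics = set()
    rows = []
    for publication, topic in events:
        status = "reused" if topic in seen_topics else "new"
        seen_topics.add(topic)
        rows.append((publication, topic, status))
    return rows


def reuse_degree(rows):
    """Reused topic references as a percentage of all topic references."""
    if not rows:
        return 0.0
    reused = sum(1 for _pub, _topic, status in rows if status == "reused")
    return 100.0 * reused / len(rows)


rows = classify(EVENTS)
for row in rows:
    print(*row)
print(f"reuse degree: {reuse_degree(rows):.1f}%")  # prints "reuse degree: 55.6%"
```

The list yields 9 topic references, 4 of them new and 5 reused, which matches the 5/9 ratio used in the text.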

The table that we just built allows us to calculate the reuse level of a system for a given time interval. Let’s assume that publications GUID-Z and GUID-Y were created in quarter 1, while publications GUID-X and GUID-W were created in quarter 2.

Reuse level for quarter 1: (1/3) * 100% ≈ 33%

Reuse level for quarter 2: (4/6) * 100% ≈ 67%

Storing and maintaining such a table of topic references makes it possible to calculate the reuse level for any given time interval.
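The per-quarter figures can be sketched in the same way. The sketch below assumes that every topic was added in the quarter in which its publication was created, and that a topic's status is decided against the whole history, so a topic first used in quarter 1 counts as reused in quarter 2. The name `reuse_by_period` and the `"Q1"`/`"Q2"` labels are hypothetical.

```python
from collections import Counter

# (period, publication, topic) facts in the order they happened.
# Assumption: topics were added in the quarter their publication was created.
EVENTS = [
    ("Q1", "GUID-Z", "GUID-A"),
    ("Q1", "GUID-Y", "GUID-A"),
    ("Q1", "GUID-Y", "GUID-B"),
    ("Q2", "GUID-X", "GUID-A"),
    ("Q2", "GUID-X", "GUID-B"),
    ("Q2", "GUID-X", "GUID-C"),
    ("Q2", "GUID-W", "GUID-B"),
    ("Q2", "GUID-W", "GUID-C"),
    ("Q2", "GUID-W", "GUID-D"),
]


def reuse_by_period(events):
    """Return {period: reuse degree in percent} for an ordered event list."""
    seen_topics = set()
    total = Counter()
    reused = Counter()
    for period, _publication, topic in events:
        total[period] += 1
        if topic in seen_topics:
            reused[period] += 1
        seen_topics.add(topic)
    return {period: 100.0 * reused[period] / total[period] for period in total}


for period, degree in reuse_by_period(EVENTS).items():
    print(f"{period}: {degree:.1f}%")
```

Quarter 1 has 1 reused reference out of 3 and quarter 2 has 4 out of 6, as in the calculation above.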

Decisions for calculating reuse

While studying various materials and implementation variants and talking with customers, we learned that there are many definitions of and opinions about what reuse is, how to calculate it, and which other metrics are important.

Our decisions are based on our knowledge and learnings from the tech docs community and customers.

Tracking Publications, not Publication Versions 

We can summarize the definitions that we were using in our example above.

| Status | Description |
| --- | --- |
| New | A document reference gets “new” status if the document hasn’t yet participated in any publication by the time of adding it. |
| Reused | We call a document reference "reused" if, by the time of adding a document to a publication, it is used in one or more publications. |

As you can see from our example and definitions, we calculate the reuse level across publications (horizontal reuse) and not across publication versions (vertical reuse).

Let’s look at an example to explain this in a bit more detail. Say that last year you released new documentation as part of a new product release. The publication that was released contains 10 topics. Each of those 10 facts of a topic being added to the publication represents value gained from using the product. Imagine that this year you need to enhance the documentation with new content and add two extra topics. You create a new version of the publication. At the point in time when you create the new version, your content has not changed, so you have not created new value by using the product. This is called vertical reuse: reuse across different versions of the same publication. However, once the two new topics are added to this version, they are counted, since they add new value to your publication.
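One way to picture this rule is to count a topic reference only the first time a given topic appears in a given publication, regardless of the publication version. The following Python sketch makes that assumption explicit. The identifiers `PUB-1` and `T1` to `T12` and the function `counted_per_version` are hypothetical and only mirror the 10-plus-2 example above.

```python
from collections import Counter

# Version 1 of the publication holds 10 topics; version 2 carries the same
# 10 topics forward and adds 2 new ones (hypothetical identifiers).
ADDITIONS = (
    [("PUB-1", 1, f"T{i}") for i in range(1, 11)]
    + [("PUB-1", 2, f"T{i}") for i in range(1, 13)]
)


def counted_per_version(additions):
    """Count references per version, ignoring topics carried over unchanged.

    Assumption: a (publication, topic) pair is counted once, in the
    version where it first appears.
    """
    already_counted = set()
    per_version = Counter()
    for publication, version, topic in additions:
        key = (publication, topic)
        if key in already_counted:
            continue  # vertical reuse: same publication, later version
        already_counted.add(key)
        per_version[version] += 1
    return dict(per_version)


print(counted_per_version(ADDITIONS))  # prints {1: 10, 2: 2}
```

Creating version 2 by itself adds nothing to the count; only the two topics that are new to the publication do.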

When you want to deliver a new product version or a new product, we recommend using a new publication. Each publication is an opportunity for reuse. This is called horizontal reuse: reuse across publications.

We analyzed customers’ data and confirmed that they were creating new publications in this case. At the 90th percentile, the number of publication versions per publication was 1 or 2, with a maximum of 4. This suggests that new publication versions are created to adjust and make small enhancements to delivered content.

Although, by our definition, we calculate the reuse level across publications and not publication versions, Tridion Docs still brings value by making it easy to track deltas between publication versions.

Reuse units  

We include topics in the calculation.

You might have content that is not part of any publication (repository documents), for example archived (stale) content. We don’t count such documents in the reuse statistics.

Usually, a new topic version adds to the initially created one but does not wholly revamp the content. So, we do not consider multiple versions of a topic as different reuse units.

Variables

Library topics are excluded. The granularity we chose for the reuse calculation is the document level, whereas library topics are typically used for reusing smaller content parts (e.g., paragraphs).

Content references

Content references require effort to maintain centrally, and that effort can be even higher than the value of the part of the content that is reused. Content references are therefore excluded from the calculation.

Conditions

What about conditions? We always include conditioned content to reduce calculation round trips.
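The inclusion and exclusion rules from this section can be summarized as a filter. This is a minimal sketch under stated assumptions: the `Reference` record and all of its fields are hypothetical and do not describe the Tridion Docs data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Reference:
    """Hypothetical record of a document being referenced."""
    document: str                       # logical id, independent of version
    publication: Optional[str] = None   # None: not part of any publication
    is_library_topic: bool = False
    is_content_reference: bool = False
    is_conditioned: bool = False


def is_counted(ref: Reference) -> bool:
    """Apply the rules described in the text to a single reference."""
    if ref.publication is None:
        return False  # repository documents are not counted
    if ref.is_library_topic:
        return False  # library topics are excluded
    if ref.is_content_reference:
        return False  # content references are excluded
    return True       # conditioned content is always included


references = [
    Reference("GUID-A", publication="GUID-Z"),
    Reference("GUID-B", publication="GUID-Y", is_conditioned=True),
    Reference("GUID-OLD"),
    Reference("GUID-LIB", publication="GUID-Y", is_library_topic=True),
    Reference("GUID-C", publication="GUID-X", is_content_reference=True),
]
print([ref.document for ref in references if is_counted(ref)])
# prints ['GUID-A', 'GUID-B']
```

Because the record carries a logical document id rather than a version, multiple versions of the same topic are treated as one reuse unit here.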

Discussing and agreeing on all the definitions, calculation processes, and technical solutions, and ruling out some ways of working (smaller draft publications or “chapter” publications), took some effort.

However, having all that aligned allows us to calculate and build different metrics charts, like content reuse and return on investment, that we will discuss in a later blog post.

As you can see, calculating reuse in an “intuitive” way takes some effort! Of course, the new reuse metrics feature takes care of all that for you. But why not try a fun exercise to test your knowledge of the principle?

Across these four publications, how many times were topics used overall, how many new topics are there, how many times is a topic reused, and what is the reuse degree?

  • Hello Frieda,

    Thank you for your questions.

    The new Metrics feature will be delivered as part of the mandatory Hotfix on top of the Tridion Docs 15.1 release. 

    The goal of this blog is to explain the reuse formula that we are using.

  • I'm sorry, I'm trying to understand what is going on here.
    Is "Tridion Docs’ new Reuse Metrics" an actual existing new feature (in v15), or one still being proposed/explored/developed?
    It seems like this is just a philosophical debate about the formula.
    Are we voting on the formula?

  • Hello  

    Thank you for taking the time to read the blog.

    Let me cover your questions one by one.

    1. In our example and the way we calculate the reuse degree, we can calculate the relative reuse degree for Q1 and Q2 separately. 

    If a reuse degree is required for a time interval that includes both Q1 and Q2, then publications Z and Y will be included.

    Could you please elaborate more regarding the second part of the question?

    2. When new topics are added to Z and Y in Q2, this information will be logged and counted during the calculation process.

    It is important to note that the timestamp of when a publication is created does not influence the calculation process; we are not aggregating data around this timestamp. The time when a topic is added to a publication is what matters (and is being tracked), since this is the fact of reuse. These facts of reuse on a timeline can be calculated for a given time interval.

    3. In our initial implementation, you will indeed not be able to see which topics are reused. But this can be improved in future versions.

    You have an option to see a system-wide reuse degree or filter by a Product that is metadata assigned to Publication. A Group of Products can represent a Business Unit.

    I will cover UX capabilities in the following blogpost. 

    4. Content references, as I explained, require effort to maintain centrally, and that effort can be even higher than the value of reusing the parts of the content.

    The granularity of reuse that we chose is the document level, while a content reference covers only parts of a document.

  • Thank you for your detailed info! I have the following comments and questions:

    • Regarding the comparison example between Q1 and Q2, I think the reuse degree for Q2 needs to include publications Z and Y even though their topic status didn't change. Regarding #1 and #2, let's suppose that topic F (instead of topic A) is reused. It is still reused in Q2. 
    • Is it captured if publications Z and Y (created in Q1) are updated for reuse in Q2?
    • I understand that the Reuse Metrics will not show which topics are reused but the system-wide reuse degree for all publications. Is this correct?
    • Content references are excluded, but our writers use them widely.