It happens quite often: one gets an Excel (or Word, or other) file with tags inside. Some of those tags are HTML, so they could easily be parsed with the HTML filter already present there.
And when we parse the embedded content of an XML file as HTML, some elements are sometimes still left in the text, like text in {} or placeholders like %s or %1, but parsing them at that point isn't possible anymore. So please also equip the HTML file types with an embedded content option for such special purposes.
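
To illustrate the kind of second pass being asked for, here is a minimal sketch in Python. It is not how any existing filter works. The names `PLACEHOLDER`, `SegmentParser` and `tokenize` are made up for this example, and the pattern only covers the cases mentioned above (text in {}, %s, %d and numbered %1-style placeholders). The idea is that the HTML pass marks the tags first, and the same pass then marks the leftover placeholders inside the remaining text.

```python
import re
from html.parser import HTMLParser

# Assumed placeholder pattern (hypothetical): text in {}, numbered %1, %2, ...
# and printf-style %s / %d. A real filter would let the user configure this.
PLACEHOLDER = re.compile(r"\{[^{}]*\}|%\d+|%[sd]")


class SegmentParser(HTMLParser):
    """Collects (kind, text) pairs: 'tag', 'placeholder' or 'text'."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Keep the start tag exactly as it appeared in the input.
        self.parts.append(("tag", self.get_starttag_text()))

    def handle_endtag(self, tag):
        self.parts.append(("tag", f"</{tag}>"))

    def handle_data(self, data):
        # Second pass: split the plain text around any placeholders.
        pos = 0
        for match in PLACEHOLDER.finditer(data):
            if match.start() > pos:
                self.parts.append(("text", data[pos:match.start()]))
            self.parts.append(("placeholder", match.group()))
            pos = match.end()
        if pos < len(data):
            self.parts.append(("text", data[pos:]))


def tokenize(fragment):
    """Return the fragment as a list of (kind, text) pairs."""
    parser = SegmentParser()
    parser.feed(fragment)
    parser.close()
    return parser.parts


for kind, text in tokenize("<b>Hello {name}</b>, you have %1 new %s."):
    print(kind, repr(text))
```

With something like this, {name}, %1 and %s would come out as protected placeholders next to the HTML tags, instead of staying in the translatable text.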