The Term Store and Invoice Scanning

How would you set up a taxonomy for document classification of vendor invoices? The goal is to identify key terms within a document that map that invoice to a particular vendor. I am working with PSIGEN PSI:Capture and trying to use the term store functionality to auto-populate key vendor criteria and build a classification taxonomy within SharePoint. The capture software can auto-extract data from scanned documents, such as a phone number or an email address, and then add terms based on its findings. I want to make sure I design this right the first time.

My first swag would be:

Vendor Name

  • Phone or Fax
  • Email Address
  • Address

Once I have populated these terms, I can run lookups against the term store and match newly extracted values to pre-existing terms. Please provide feedback on how you would proceed.
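
To make the lookup idea concrete, here is a rough sketch of the matching logic I have in mind. It is only an illustration: the `VENDOR_TERMS` dictionary is a hypothetical in-memory stand-in for the term store, the vendor data is made up, and none of the function names come from PSI:Capture or SharePoint. The point is that extracted values probably need to be normalized before they are compared to stored terms, since the same phone number can be printed several ways.

```python
import re

# Hypothetical stand-in for the term store: one parent term per vendor,
# with child terms for phone and email. All values are invented.
VENDOR_TERMS = {
    "Acme Supply": {
        "phones": ["(555) 010-1234"],
        "emails": ["billing@acme.example"],
    },
    "Northwind Parts": {
        "phones": ["555.010.9876"],
        "emails": ["ar@northwind.example"],
    },
}


def normalize_phone(value):
    """Keep digits only so '(555) 010-1234' and '555-010-1234' compare equal."""
    return re.sub(r"\D", "", value)


def normalize_email(value):
    """Trim whitespace and lowercase so casing differences do not block a match."""
    return value.strip().lower()


def build_index(vendor_terms):
    """Map each normalized (kind, value) pair to its vendor name."""
    index = {}
    for vendor, terms in vendor_terms.items():
        for phone in terms.get("phones", []):
            index[("phone", normalize_phone(phone))] = vendor
        for email in terms.get("emails", []):
            index[("email", normalize_email(email))] = vendor
    return index


def match_vendor(extracted, index):
    """Return the first vendor matching any extracted (kind, raw_value) pair, else None."""
    normalizers = {"phone": normalize_phone, "email": normalize_email}
    for kind, raw in extracted:
        normalize = normalizers.get(kind)
        if normalize is None:
            continue  # kinds without a normalizer (e.g. address) are skipped here
        vendor = index.get((kind, normalize(raw)))
        if vendor is not None:
            return vendor
    return None
```

For example, `match_vendor([("phone", "555-010-1234")], build_index(VENDOR_TERMS))` would resolve to the made-up "Acme Supply" entry even though the stored term uses a different phone format. Address matching is left out on purpose, since free-text addresses would need fuzzier comparison than an exact lookup.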