SharePoint Premium (formerly Microsoft Syntex) Processing Models
This article gives you the nuts and bolts of how Syntex does what it does and goes over the human interaction with the AI tool.
Syntex is a content understanding solution. Humans create AI models that identify documents and extract information from those documents to store as a content type and metadata in SharePoint. If a Syntex model is applied to a library, you simply drag and drop documents into the library (or mass upload via .csv file) and the model will automatically assign each document a content type and extract the metadata for you. It will also apply a sensitivity and/or retention label if you require that.
Two different methods (currently) exist in Microsoft Syntex. A document understanding model (also referred to as unstructured document processing) relies heavily on finding content based on keywords. A form processing model (also referred to as structured document processing) relies on the format of the document to find information.
1. Document understanding (unstructured) processing model
Examples include:
2. Form (structured) processing model
Examples include:
Training your model
To start building either type of model, you’ll need at least five documents that are alike. That means that you identify all of them as a “policy” (for example). Then you will need at least one document that isn’t a policy. You click through each document and say “yes” or “no” so that the AI can learn what a policy either looks like (form processing) or what keywords it contains (document understanding).
Once you are confident that the AI can identify the content type of the document, you can teach it to find metadata in the document. You create what is called an extractor (the yes/no step above trains the classifier), and then you click through the documents and show the AI where the information you are looking for is located inside each document by highlighting it. The AI will look for labels preceding the information, and it will look at where the information sits in the document (for example, at the top).
Document understanding models may take a little longer to train because the information isn’t always in the same location (upper right-hand corner, for example) or the information may be duplicated in the document, so you may need to narrow down where the AI looks for the data. For example, you can tell the AI to look only in the first half of the first page of the document for the information you are trying to extract.
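To make the idea of "a label before the value, searched only in part of the page" a bit more concrete, here is a toy Python sketch. It is not how Syntex works internally and it does not call any Syntex API. The function name `find_labeled_values`, the sample page text, and the regular expression are all assumptions made up for illustration.

```python
import re


def find_labeled_values(page_text: str, label: str, first_half_only: bool = False) -> list[str]:
    """Return every value that follows `label:` on its own line.

    Hypothetical helper for illustration only, not a Syntex API.
    """
    lines = page_text.splitlines()
    if first_half_only:
        # Keep only the first half of the page's lines (rounding up).
        lines = lines[: (len(lines) + 1) // 2]
    # Match "<label>: <value>" at the start of a line, ignoring case.
    pattern = re.compile(rf"^\s*{re.escape(label)}\s*:\s*(.+?)\s*$", re.IGNORECASE)
    values = []
    for line in lines:
        match = pattern.match(line)
        if match:
            values.append(match.group(1))
    return values


# Made-up first page in which the same label appears twice.
first_page = """Acme Travel Policy
Effective Date: 2024-01-15
Owner: Human Resources

This policy replaces the version with
Effective Date: 2019-06-01
"""

# Searching the whole page finds two candidates.
print(find_labeled_values(first_page, "Effective Date"))
# Narrowing to the first half of the page leaves a single candidate.
print(find_labeled_values(first_page, "Effective Date", first_half_only=True))
```

The point of the sketch is only that narrowing the search area removes the duplicate candidate. In Syntex you express this kind of narrowing through the model training interface rather than in code.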
You’ll need five documents (at least) to test the model thoroughly. So, at least six to train it (five that are a yes and one that is a no) and five more to test it. Once you are confident that the model is successfully assigning the right content type and extracting the metadata, then you apply that model to a library and watch it work by dragging and dropping documents into it (or mass uploading).
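If you are gathering sample files before a training session, a quick tally can tell you whether you have enough. The sketch below simply encodes the minimums described above (five "yes" documents, one "no" document, five test documents). The names `SampleSet` and `missing_samples` are hypothetical and have nothing to do with Syntex itself.

```python
from dataclasses import dataclass

# Minimums taken from the text above.
MIN_POSITIVE = 5  # documents that are the content type ("yes")
MIN_NEGATIVE = 1  # documents that are not the content type ("no")
MIN_TEST = 5      # extra documents held back for testing


@dataclass
class SampleSet:
    positive: int
    negative: int
    test: int


def missing_samples(samples: SampleSet) -> dict[str, int]:
    """Return how many more files each group needs (0 means enough)."""
    return {
        "positive": max(0, MIN_POSITIVE - samples.positive),
        "negative": max(0, MIN_NEGATIVE - samples.negative),
        "test": max(0, MIN_TEST - samples.test),
    }


# Example: five policies, no counter-example yet, three test files.
print(missing_samples(SampleSet(positive=5, negative=0, test=3)))
# Smallest total number of files that satisfies all three minimums.
print(MIN_POSITIVE + MIN_NEGATIVE + MIN_TEST)
```

In practice more samples than the minimum generally give the model more to learn from, so treat these numbers as a floor.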
Possible Syntex scenarios:
But that’s only the beginning. Once you have the metadata extracted then you can use it to fire off workflows to do all sorts of things with the documents:
A note on licensing: For each Syntex per-user license, you are allocated 3,500 AI Builder credits per month, pooled at the tenant level. Document understanding can be done by anyone with a Syntex license assigned to them, but form processing requires a capacity add-on to your Power Platform subscription. Capacity must be allocated to the Power Apps environment where you will use AI Builder. To create a form processing model, I’ve only ever had to ask for a Power Apps license (per user) to be assigned to me. The Power Automate per user with attended RPA plan includes AI Builder capacity as well.
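As a rough illustration of how the pooled credits add up, the sketch below multiplies the number of licensed users by the 3,500 figure quoted above. The function name is hypothetical, and licensing terms change over time, so check the current Microsoft documentation before relying on the number.

```python
# Per-license monthly allocation as quoted in this article (may be out of date).
CREDITS_PER_LICENSE_PER_MONTH = 3_500


def pooled_monthly_credits(licensed_users: int) -> int:
    """Return the tenant-level pool of AI Builder credits for one month."""
    if licensed_users < 0:
        raise ValueError("licensed_users must not be negative")
    return licensed_users * CREDITS_PER_LICENSE_PER_MONTH


# Example: a tenant with 40 licensed users.
print(pooled_monthly_credits(40))  # 140000
```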
To read more or to learn about the prebuilt models that come with Syntex, see the Microsoft Learn article "Overview of model types in Microsoft Syntex".
Other articles in this series:
Part 1 - What does Microsoft Syntex Do?