SharePoint Premium (formerly Microsoft Syntex) Processing Models

Tiffany Songvilay

MVP | Modern Workplace Guru | Copilot for M365 | SharePoint Premium | Intranet Migrations and Modernizations | Power Platform | Viva Suite | LGBTQ+ | @tiffsongvilay

Published May 8, 2023

This article gives you the nuts and bolts of how Syntex does what it does and goes over how humans interact with the AI tool.

Syntex is a content understanding solution. Humans create AI models that identify documents and extract information from them, and that information is stored in SharePoint as a content type and metadata. Once a Syntex model is applied to a library, you simply drag and drop documents into the library (or mass upload via .csv file) and the model automatically assigns each document a content type and extracts the metadata for you. It can also apply a sensitivity and/or retention label if you require that.

Two types of model currently exist in Microsoft Syntex. A document understanding model (also referred to as unstructured document processing) relies heavily on finding content based on keywords. A form processing model (also referred to as structured document processing) relies on the format of the document to find information.

1. Document (unstructured) understanding model

Used for inconsistent file formats
Trainable classifier (to identify content type) with optional extractors (extract metadata)
Multiple models can be applied to the same library

Examples include:

Policies and Procedures
Letters
Contracts
Request for Proposal Documents
Project Documents

2. Form (structured) processing model

Used for semi-structured file formats
Settable classifier (associates a page layout with a content type)
Only one model can be applied per library (but can co-exist with multiple unstructured models in the same library)

Examples include:

Purchase Orders
Quotes and Invoices
Applications

Training your model

To start building either type of model, you’ll need at least five documents that are alike. That means that you identify all of them as a “policy” (for example). Then you will need at least one document that isn’t a policy. You click through each document and say “yes” or “no” so that the AI can learn what a policy either looks like (form processing) or what keywords it contains (document understanding).

Once you are confident that the AI can identify the content type of the document, you can teach it to find metadata in the document. You create what is called an extractor, then click through the documents and show the AI where the information you are looking for is located by highlighting it in each one. The AI looks for labels preceding the information and at where the information sits on the page (for example, the top of the document).

Document understanding models may take a little longer to train because the information isn’t always in the same location (upper right-hand corner, for example) or the information may be duplicated in the document, so you may need to narrow down where the AI looks for the data. For example, you can tell the AI to look only in the first half of the first page of the document for the information you are trying to extract.
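To make the idea of "a label preceding the information" and "only the first half of the first page" concrete, here is a minimal Python sketch. It is not how Syntex works internally and it does not call any Syntex API; the function name, the sample text, and the "Policy Number" label are all made up for illustration.

```python
import re

def extract_after_label(page_text, label, search_fraction=0.5):
    """Return the text that follows `label`, searching only the first
    `search_fraction` of the page's lines. Returns None if not found."""
    lines = page_text.splitlines()
    # Keep at least one line so very short pages are still searched.
    cutoff = max(1, int(len(lines) * search_fraction))
    # Match the label, an optional colon, then capture the rest of the line.
    pattern = re.compile(re.escape(label) + r"\s*:?\s*(.+)", re.IGNORECASE)
    for line in lines[:cutoff]:
        match = pattern.search(line)
        if match:
            return match.group(1).strip()
    return None

# Hypothetical first page of a policy document.
first_page = """Acme Corp Travel Policy
Policy Number: TP-104
Effective Date: 2023-01-01

Section 1
This replaces Policy Number: TP-099
Contact HR with questions."""

print(extract_after_label(first_page, "Policy Number"))  # prints TP-104
```

Because the search stops halfway down the page, the second "Policy Number" mention near the bottom is never considered, which is the same reason you narrow the search area when a value is duplicated in a document.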

You’ll need at least five more documents to test the model thoroughly. That makes at least eleven documents in total: six to train the model (five that are a yes and one that is a no) and five more to test it. Once you are confident that the model is assigning the right content type and extracting the metadata, you apply that model to a library and watch it work by dragging and dropping documents into it (or mass uploading).
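If you are gathering sample files before you start, the minimum counts above are easy to check. The sketch below is only a planning aid written for this article; the function and constant names are hypothetical and have nothing to do with Syntex itself.

```python
# Minimum sample counts described in this article.
MIN_TRAINING_POSITIVE = 5  # documents that ARE the content type
MIN_TRAINING_NEGATIVE = 1  # documents that are NOT the content type
MIN_TEST = 5               # documents held back for testing

def sample_set_gaps(training_positive, training_negative, test):
    """Return how many more documents are needed in each group.
    An empty dict means the sample set meets every minimum."""
    minimums = {
        "positive training examples": (training_positive, MIN_TRAINING_POSITIVE),
        "negative training examples": (training_negative, MIN_TRAINING_NEGATIVE),
        "test documents": (test, MIN_TEST),
    }
    return {
        name: needed - have
        for name, (have, needed) in minimums.items()
        if have < needed
    }

# Five policies, no counter-example yet, and only two test files.
print(sample_set_gaps(training_positive=5, training_negative=0, test=2))
```

An empty result means you have enough documents to start; anything else tells you which group is short and by how many.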

Possible Syntex scenarios:

Procurement process – extracting metadata from documents required to approve a new vendor
Records management – applying retention to incoming documents
Improving search by capturing keywords in documents
Identifying sensitive information by finding it when the document is uploaded and then securing the content with a sensitivity label

But that’s only the beginning. Once the metadata is extracted, you can use it to fire off workflows that do all sorts of things with the documents:

Move them to a new, secured location
Route them for approval to the right person
Send a notification that all documents have been received
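As an illustration of the last item, the logic behind an "all documents received" notification is a simple set comparison on the content types the model has assigned. In practice you would build this in a workflow tool rather than in code; the Python below is just a sketch of the rule, and the required document names are invented for the example.

```python
# Hypothetical content types required to approve a new vendor.
REQUIRED_VENDOR_DOCUMENTS = {"W-9", "Insurance Certificate", "Signed Contract"}

def missing_documents(received_content_types, required=REQUIRED_VENDOR_DOCUMENTS):
    """Return the required content types that have not been received yet,
    sorted alphabetically. An empty list means everything has arrived."""
    return sorted(set(required) - set(received_content_types))

# Content types assigned so far to one vendor's uploads.
received = ["W-9", "Signed Contract"]

still_missing = missing_documents(received)
if still_missing:
    print("Still waiting on:", ", ".join(still_missing))
else:
    print("All documents have been received.")
```

The workflow only has to run this check each time a new document is classified and send the notification when the list comes back empty.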

A note on licensing: Each Syntex per-user license comes with 3,500 AI Builder credits per month, pooled at the tenant level. Anyone with a Syntex license assigned to them can build a document understanding model, but form processing requires a capacity add-on to your Power Platform subscription, and that capacity must be allocated to the Power Apps environment where you will use AI Builder. In practice, to create a form processing model I’ve only ever had to ask for a per-user Power Apps license to be assigned to me. The Power Automate per user with attended RPA plan includes AI Builder capacity as well.
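Because the credits are pooled, the tenant's monthly allowance is just the license count multiplied by 3,500. Here is that arithmetic as a small Python sketch; the function name is made up, and the 3,500 figure is the one quoted in this article, so check current Microsoft licensing terms before relying on it.

```python
# Per-license monthly AI Builder credits, as stated in this article.
CREDITS_PER_LICENSE_PER_MONTH = 3500

def pooled_monthly_credits(license_count):
    """Return the total AI Builder credits pooled at the tenant level
    for the given number of Syntex per-user licenses."""
    if license_count < 0:
        raise ValueError("license_count cannot be negative")
    return license_count * CREDITS_PER_LICENSE_PER_MONTH

print(pooled_monthly_credits(20))  # prints 70000
```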

To read more or to learn about the prebuilt models that come with Syntex, see the Microsoft Learn article "Overview of model types in Microsoft Syntex."
