365Vision

Copilot Studio – Better SharePoint documents knowledge base

Having built quite a few CoPilot Studio Agents, I have experienced multiple project teams that were unhappy with the answer quality of Copilot Studio using the OOTB SharePoint connection as a knowledge source. So, I came up with a solution that retrieves the document from SharePoint and directly adds them to the Copilot Studio Agent. Not only does this return better answers, but it also provides you with the opportunity to filter on metadata, which the default connector does not support.

Oh, and let’s not forget Henry Jammes’ solution that helped me add citation links so that you can still open the document directly in SharePoint.

Let’s dive into it!

Proces Overview

Let’s start with an overview of the process. In short, with Power Automate, I retrieve documents from SharePoint and directly add them to my Copilot Agent. If a document has not been updated for a certain amount of time, I remove it from my Agent as the information is no longer relevant.

In my scenario I used SharePoint as a datasource but, you can imagine that using this logic other document management systems can also be used.

Although I am satisfied with this solution, I believe that Microsoft will likely provide a better solution in the near future. Therefore, I expect this entire setup to be of temporary use.

Document Sync

If you take a look at the backend of Copilot Studio, you will notice that every knowledge source is stored as a separate DataVerse item within the ‘Copilot Component’ table. However, I didn’t just want to add any document to the agent. I wanted to filter based on metadata, keep it in sync, delete items that haven’t been updated for a year, and provide direct links to SharePoint with citations.

I came up with the following custom DataVerse table to keep track of the document sync:

  • Title: Used to check if document is already present, it must be unique;
  • DataVerse ID: ID of the Copilot Component
  • Doc Updated on: To check if document is still relevant and needs updating;
  • Document Link: Used for the citation, to link to the original file.

And this Power Automate flow to facilitate the proces:

  1. Retrieve the Documents from SharePoint, you can add any filter that you like.
    • In my case we only add documents that have been updated in the last year
  2. For Each document check if you want it to be added to the Agent.
  3. Check if the document already exists:
    • Yes: Update the existing document content only.
    • No: Step 4
  4. Add a new row ‘Copilot component’ and update it with the document content
  5. Add a new row to our own DocSync table
  6. Optional: Create a clean-up process to delete documents no longer found in SharePoint

 

Compose – SchemaName

  • SchemaName : [internal bot name]_[RandomNumber] (not sure if this is needed)

 

Add a new row – to copilot components

  • ComponentType: Bot File Attachment
  • parentbotID : bots([BOTID])
  • SchemaName: Compose Output

 

Add a new row – to sync table

  • DataVerseID: ID of the Bot component from step above
  • DocumentLink: Full Link to SharePoint document, replace spaces with %20

Add the citations

With the out-of-the-box (OOTB) functionality, if you add a document directly to your agent, the citations returned are purely the chunked text where the answer was found. What I wanted for the solution was the best of both worlds: better answers while still being able to open the SharePoint document directly.

I got this to work by using part of Henry Jammes’ solution and creating an adaptive card to return the result to the user. The only downside is that you have to turn off the Orchestration / Generative AI setting, as it omits the Conversational Boosting Topic.

As you can see below, this works quite well.

To get this to work, follow these simple steps:

  1. Go to your ‘Conversational Boosting’ topic
  2. Edit the ‘Generative answers’ node to NOT return a message
  3. Add a DataVerse action to retrieve all of the items from the Doc Sync Dataverse Table we created
  4. Add an Adaptive card node and paste the code below in to it
				
					{
 type: "AdaptiveCard",
 '$schema': "http://adaptivecards.io/schemas/adaptive-card.json",
 version: "1.5",
 body: [
 {
 type: "Container",
     items: [
     {
     type: "TextBlock",
     // Generated answer text
     text: Topic.Answer.Text.Content & Char(10) & Char(10) ,
     wrap: true
     }
     ]
 },
 {
    //Loop through each citation
     type: "Container",
     items: ForAll(Topic.Answer.Text.CitationSources,
     //If there is a URL, use it
     If(!IsBlank(Url),
     {
         type: "TextBlock",
         text: "[" & Id &"]:" & Url,
         wrap: true,
         size: "Small",
         weight: "Lighter"
    },

 // If the citation has no URL, get it from dataverse and make the citation clickable and open the source URL.
     {
         type: "TextBlock",
         text: "[" & Id &"]: [" & Name & "](" & LookUp(Topic.Var1, ins_title = Name, ins_documentlink) & ")",
         wrap: true,
         size: "Small",
         weight: "Lighter"
     }
 ))
 }]
}
				
			

And that should do it! It’s quite a straightforward process which you can adapt to your own needs. While I haven’t explained everything in detail, most aspects should be self-explanatory.
Let me know if this has helped you or if you’ve got any questions!

Tips

  • Use a configration table to determine which sites and libraries you want to pull documents from;
  • Use a configuration table to determine filters;
  • If you use a Pipeline to deploy your Chatbot, you might need to reindex (remove and re-add) all documents after a deployment (you will know if the agent does not answer any questions);
  • When adding new documents to your agent the agent will have to be republished, you could do this automatically using the PAC tools (pac copilot publish). Updating documents do not require a publish.

Peter

Power Platform Enthousiast, with a current focus on Copilot Studio & AI

Leave a Reply

Your email address will not be published. Required fields are marked *