Microsoft Graph Plugin

The Microsoft Graph plugin (tika-pipes-microsoft-graph) provides a fetcher that retrieves files from OneDrive, SharePoint, and other Graph-accessible sources. It is fetcher-only — pair it with another emitter and iterator.

Interface Component name Class

Fetcher

microsoft-graph-fetcher

MicrosoftGraphFetcher

Credentials

The fetcher authenticates against Microsoft Entra (Azure AD) using one of two credential modes — set exactly one:

  • Client secret (clientSecretCredentialsConfig) — easiest to set up; client secrets rotate manually.

  • Client certificate (clientCertificateCredentialsConfig) — for environments that require certificate-based auth.

Both modes need the same three identity fields: tenantId, clientId, plus either clientSecret or certificate.

Microsoft Graph Fetcher (microsoft-graph-fetcher)

Fetches files via the Microsoft Graph API. The fetch key encodes the Graph object identifier.

{
  "fetchers": {
    "msgf": {
      "microsoft-graph-fetcher": {
        "clientSecretCredentialsConfig": {
          "tenantId": "REDACTED-TENANT-UUID",
          "clientId": "REDACTED-CLIENT-UUID",
          "clientSecret": "REDACTED"
        },
        "scopes": ["https://graph.microsoft.com/.default"],
        "spoolToTemp": true
      }
    }
  }
}

Configuration

Field Default Description

clientSecretCredentialsConfig

required (XOR)

Nested object with tenantId, clientId, clientSecret. See Credentials.

clientCertificateCredentialsConfig

required (XOR)

Nested object with tenantId, clientId, certificate. See Credentials.

scopes

empty

OAuth scopes to request. Typical: ["https://graph.microsoft.com/.default"] (application permissions).

spoolToTemp

false

If true, files are spooled to a temp file before being parsed.

throttleSeconds

optional

Rate-limit array — consecutive failures sleep for the corresponding number of seconds.

Notes

  • The plugin uses the official microsoft-graph SDK.

  • For most service-to-service workflows, use application permissions (https://graph.microsoft.com/.default scope) — delegated permissions require an interactive flow that the fetcher does not support.

  • Client secrets are sensitive — use environment-variable substitution or external secret stores rather than inlining them in source control.