Open-core connector SDKs: why every enterprise needs custom data plumbing

Every "unified data" vendor advertises the size of their connector catalog. Three hundred. Five hundred. Six hundred and counting. Customers nod, sign the contract, and discover six weeks in that the connector their actual use case needs isn't there.

The pattern is so consistent that we keep a mental list of the systems that always block the AI rollout in our region:

Mainframe DB2 inside the regional bank.
SAP iDOC feeds with a 20-year-old custom schema.
A REST API written by an internal team in 2014 that no vendor has ever heard of.
The patient-records system the hospital network has run since 2002.
The proprietary OT historian on the manufacturing floor.
The bespoke trading platform the broker-dealer treats as a competitive moat.

You will not find a pre-built connector for any of these. You will not find one next year either. They are too niche for any vendor to prioritise, and too critical to the customer to skip.

The real question

The question isn't "does this platform have a connector for SAP?" (almost all of them do). The question is "how hard is it for my team to write the next one?"

For most enterprise data platforms today, the answer to that question is "fork the open-source repo, write code against an undocumented internal API, hope it survives the next minor-version upgrade". That is not a real answer.

A real open-core SDK is a contract:

A documented, stable interface. A small ABC / Protocol your developer extends. The shape doesn't change between minor versions.
A path to ship your connector without forking. Drop a Python file in a directory; the product picks it up at startup. No need to maintain a parallel branch.
The same code path as the built-ins. If the framework reads Postgres through the same interface your custom connector implements, you have parity. If the built-ins have privileged hooks your code can't reach, you're on a second-class path.
An explicit credential model. Your custom connector declares which fields are secrets; the framework encrypts them at rest. You don't have to invent the vault yourself.
Working examples. A reference connector you can read end-to-end in 80 lines.
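To make the stable-interface and credential-model points above concrete, here is a minimal sketch of how small such a contract can be. Everything in it is an assumption made for illustration: `BaseConnector`, `split_secrets`, and the config field names are hypothetical and are not taken from any particular product's SDK.

```python
from abc import ABC, abstractmethod
from typing import Any


class BaseConnector(ABC):
    """Hypothetical connector contract; all names are illustrative only."""

    type: str = ""
    display_name: str = ""
    # Config fields the framework should treat as secrets.
    credential_keys: frozenset[str] = frozenset()

    def __init__(self, config: dict[str, Any]) -> None:
        self.config = config

    @abstractmethod
    async def healthcheck(self) -> bool: ...

    @abstractmethod
    async def introspect(self) -> list[str]: ...

    @abstractmethod
    async def execute(self, sql: str, *, max_rows: int = 10_000) -> list[dict[str, Any]]: ...


def split_secrets(connector_cls, config):
    """Separate declared secret fields from plain config.

    A framework could encrypt the second dict before storing it; the
    encryption step itself is out of scope for this sketch.
    """
    keys = connector_cls.credential_keys
    plain = {k: v for k, v in config.items() if k not in keys}
    secrets = {k: v for k, v in config.items() if k in keys}
    return plain, secrets


class HrConnector(BaseConnector):
    type = "internal_hr"
    display_name = "Internal HR system"
    credential_keys = frozenset({"api_token"})

    async def healthcheck(self) -> bool:
        return True

    async def introspect(self) -> list[str]:
        return ["employees"]

    async def execute(self, sql: str, *, max_rows: int = 10_000) -> list[dict[str, Any]]:
        return []


plain, secrets = split_secrets(
    HrConnector,
    {"base_url": "https://hr.example.internal", "api_token": "s3cret"},
)
```

The connector author only declares which fields are sensitive; deciding how to store them stays with the framework, which is the division of labour the list describes.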

The good news: this is a low bar. Many open-source projects clear it. The bad news: most commercial data platforms don't — their economics depend on charging per-connector or per-row, which is incompatible with a "you can write your own for free" SDK.

Why this is the real moat

This matters strategically because nearly every Fortune 1000 company has at least one critical data source that no vendor has integrated with, and it can't get AI value out of its unified-data platform until that source is hooked up.

If your platform forces them to wait six months for the vendor's roadmap to catch up, or to pay a partner system integrator USD 500k to write the connector as a custom engagement, you lose the deal to the platform that says "a Python file in a directory; an afternoon of work; here's the docs".

And the customer reaches the same conclusion in the second meeting of the second pilot. The vendor with 600 connectors and a closed framework looks worse than the vendor with 8 connectors and an open SDK, because the customer's actual use case isn't in either catalog.

What a good SDK looks like in practice

The reference shape we like:

from wekams_lens_sdk import (
    Connector, ConnectorError,
    IntrospectedColumn, IntrospectedTable, QueryResult,
)

class HrSystemConnector(Connector):
    type = "internal_hr"
    display_name = "Internal HR system"
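    # Fields listed here are secrets; the framework encrypts them at rest.
    credential_keys = frozenset({"api_token"})

    async def healthcheck(self) -> bool:
        # _ping() is a helper you write yourself, e.g. a cheap call to the HR API.
        return await self._ping()

    async def introspect(self) -> list[IntrospectedTable]:
        # Call the internal HR REST endpoint and build one IntrospectedTable
        # (with its IntrospectedColumn entries) per logical entity.
        ...

    async def execute(
        self, sql: str, *, max_rows: int = 10_000, timeout_seconds: int = 60
    ) -> QueryResult:
        # Translate the SQL into the HR system's query language,
        # OR refuse with a useful error (e.g. ConnectorError) explaining
        # which query shapes are supported.
        ...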

Three methods. About 80–150 lines for a realistic source. The customer's senior Python developer writes this in an afternoon and ships it. The platform picks it up at startup, the natural-language agent on top now knows about a new data source, and the AI rollout unblocks.
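The "refuse with a useful error" branch of execute is the part people tend to skip, so here is a minimal sketch of it. It is self-contained rather than built on the real SDK: the `ConnectorError` class below is a stand-in, and the in-memory `_TABLES` source and the single supported query shape are assumptions chosen to keep the example short.

```python
import asyncio
import re


class ConnectorError(Exception):
    """Stand-in for an SDK error type (assumed, not the real class)."""


# Hypothetical in-memory source: one logical entity with two rows.
_TABLES = {"employees": [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Lin"}]}

# The only query shape this toy connector accepts.
_SIMPLE_SELECT = re.compile(r"^\s*select\s+\*\s+from\s+(\w+)\s*;?\s*$", re.IGNORECASE)


async def execute(sql: str, *, max_rows: int = 10_000) -> list[dict]:
    match = _SIMPLE_SELECT.match(sql)
    if match is None:
        # Say what is supported so the caller (or an agent) can retry.
        raise ConnectorError(
            "unsupported query; only 'SELECT * FROM <table>' is supported"
        )
    table = match.group(1).lower()
    if table not in _TABLES:
        raise ConnectorError(
            f"unknown table {table!r}; known tables: {sorted(_TABLES)}"
        )
    # Enforce the row cap on the connector side.
    return _TABLES[table][:max_rows]


rows = asyncio.run(execute("SELECT * FROM employees", max_rows=1))
```

A natural-language agent sitting on top of the connector can read an error like this and rewrite its query, which is usually more useful than a generic failure.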

What this means for the platform's own connector roadmap

The honest implication: once the SDK exists, the platform's first-party connector list becomes much less commercially important. The customer can write the connector themselves, or hire a competent Python contractor for two days. The platform doesn't have to be exhaustive; it just has to be extensible.

That's the trade-off you want. You don't want to be the vendor maintaining 600 connectors against vendor APIs that change every six months. You want to be the vendor whose framework is stable enough that other people maintain connectors for you.

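This is the dynamic that makes Postgres, Kubernetes, and DuckDB the platforms they are: Postgres through extensions, Kubernetes through custom resources and operators, DuckDB through loadable extensions. None of them tried to be exhaustive; they were extensible. The ecosystem did the rest.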

How we did it in Wekams Lens

This is why Wekams Lens ships with six built-in connectors (Postgres, S3, Azure Blob, GCS, JSON-lines logs, Elasticsearch / OpenSearch) and a public SDK. The reference custom connector is a SQLite reader in ~80 lines that we ship inside the repo. Drop it in ~/.wekams/connectors/ and restart; SQLite shows up in the type picker exactly the same as Postgres. We use this same SDK ourselves to write the built-ins.
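For readers wondering what "drop a file in a directory and restart" involves mechanically, the sketch below shows one common way to do it with the standard library. It is not the Wekams Lens loader: `discover_connectors` is a hypothetical function, and the rule "any class defined in the file with a non-empty string `type` attribute counts as a connector" is a simplifying assumption (a real framework would more likely check for a subclass of its SDK base class).

```python
import importlib.util
import inspect
import tempfile
import textwrap
from pathlib import Path


def discover_connectors(directory: Path) -> dict[str, type]:
    """Import every .py file in `directory` and register connector classes."""
    registry: dict[str, type] = {}
    for path in sorted(directory.glob("*.py")):
        spec = importlib.util.spec_from_file_location(
            f"custom_connector_{path.stem}", path
        )
        if spec is None or spec.loader is None:
            continue
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # runs the file's top-level code
        for _, obj in inspect.getmembers(module, inspect.isclass):
            # Skip classes the file merely imported from elsewhere.
            if obj.__module__ != module.__name__:
                continue
            connector_type = getattr(obj, "type", "")
            if isinstance(connector_type, str) and connector_type:
                registry[connector_type] = obj
    return registry


# Demo: write a tiny connector file into a temp directory and load it.
with tempfile.TemporaryDirectory() as tmp:
    plugin = Path(tmp) / "sqlite_reader.py"
    plugin.write_text(textwrap.dedent('''
        class SqliteConnector:
            type = "sqlite"
            display_name = "SQLite file"
    '''))
    registry = discover_connectors(Path(tmp))
    found = sorted(registry)
```

If built-in connectors are registered into the same dictionary, custom and first-party connectors are indistinguishable to the rest of the product, which is the parity property described earlier.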

That's the test. If the SDK isn't good enough for the framework's own authors to use, it isn't good enough for the customer to use. Almost no commercial data platform passes this test today.

If you're evaluating data platforms in the next six months, ignore the connector count on the brochure. Ask for the SDK docs, and ask how long it takes to write a "hello world" connector against an imaginary internal REST API. The vendors that can show you an answer inside the meeting are the ones to consider.