Skip to content

Oracle Connector

The Oracle connector supports reading from and writing to Oracle databases using the python-oracledb driver.


Supported Versions

Oracle 11g through 23c.

Driver

python-oracledb (oracledb>=2.0). Uses THIN mode by default (pure Python, no Oracle Client needed, supports Oracle 12c+). For Oracle 11g, THICK mode requires the Oracle Instant Client libraries.


Connection Configuration

<source connector="oracle">
  <connection host="oracle-server"
              port="1521"
              database="ORCL"
              username="${env:ORACLE_USER}"
              password="${env:ORACLE_PASS}" />
  <connectorConfig>
    <option name="service_name" value="ORCLPDB1" />
  </connectorConfig>
</source>
Parameter Default Description
host Required Database hostname or IP.
port 1521 Database port.
database -- Oracle SID. Required if service_name is not set.
username -- Credential via ${env:VAR}.
password -- Credential via ${env:VAR}.

Connector Config Options

Option Description
service_name Oracle service name (preferred over SID).
oracle_client_lib Path to Oracle Instant Client libraries for THICK mode.

THICK Mode (11g Support)

For Oracle 11g and older versions, set oracle_client_lib to enable THICK mode:

<connectorConfig>
  <option name="oracle_client_lib" value="/opt/oracle/instantclient_21_12" />
</connectorConfig>

THICK mode is initialised once per process.


Performance Features

batcherrors (Writes)

The OracleDataWriter uses executemany with batcherrors=True and 10k-row chunking. This allows partial batch success -- rows that fail (e.g. constraint violations) are logged without aborting the entire batch.

setinputsizes

The writer calls setinputsizes() for float columns to ensure proper precision handling.

Vectorized Column Extraction

Uses Arrow's to_pylist() for C-level column extraction.


Identifier Length Handling

Oracle 11g and 12c limit identifiers to 30 characters. The connector automatically detects ORA-00972 (identifier too long) and retries with shortened names:

  • Names exceeding 30 characters are truncated with a 4-character MD5 hash suffix.
  • Example: eventvisualization_categoryoptiongroupsetdimensions becomes eventvisualization_cate_a3f2
  • The mapping is tracked so that subsequent operations (writes, indexes, FKs) use the correct names.

Oracle 12c Release 2+ supports 128-character identifiers by default.


Default Schema

The default schema is the connected user (owner), determined by SELECT user FROM dual.


Type Mapping

See Type Mappings > Oracle for the complete mapping tables.

Key points:

  • NUMBER without precision maps to decimal128(38, 10).
  • NUMBER(p, 0) with scale 0 maps to native integer types (int16, int32, int64) based on precision.
  • Oracle DATE includes time, so it maps to timestamp[us] (not date32).
  • PK columns that would be CLOB or BLOB are downsized to VARCHAR2(255).

DDL Retry Logic

Error Recovery
ORA-00972 (identifier too long) Retry with shortened names (30-char limit with hash suffix)
ORA-02000 (missing ALWAYS keyword) Retry without GENERATED BY DEFAULT AS IDENTITY (pre-12c)

Known Limitations

  • FK constraint names are prefixed with the table name and limited to 30 characters for Oracle 11g/12c compatibility.
  • ON UPDATE CASCADE is not supported by Oracle. Only ON DELETE CASCADE and ON DELETE SET NULL are applied.
  • Index creation skips Oracle-specific errors: ORA-01408 (already indexed), ORA-00955 (name collision), ORA-01450 (key too long), ORA-02327 (cannot index LOB column).

Example BDL

<?xml version="1.0" encoding="UTF-8"?>
<bani schemaVersion="1.0">
  <project name="oracle-to-pg" />
  <source connector="oracle">
    <connection host="oracle-server" port="1521"
                username="${env:ORACLE_USER}"
                password="${env:ORACLE_PASS}" />
    <connectorConfig>
      <option name="service_name" value="ORCLPDB1" />
    </connectorConfig>
  </source>
  <target connector="postgresql">
    <connection host="localhost" port="5432"
                database="analytics"
                username="${env:PG_USER}"
                password="${env:PG_PASS}" />
  </target>
  <options>
    <batchSize>100000</batchSize>
    <parallelWorkers>4</parallelWorkers>
  </options>
</bani>