Oracle Connector¶
The Oracle connector supports reading from and writing to Oracle databases using the python-oracledb driver.
Supported Versions¶
Oracle 11g through 23c.
Driver¶
python-oracledb (oracledb>=2.0). Uses THIN mode by default (pure Python, no Oracle Client needed, supports Oracle 12c+). For Oracle 11g, THICK mode requires the Oracle Instant Client libraries.
Connection Configuration¶
<source connector="oracle">
<connection host="oracle-server"
port="1521"
database="ORCL"
username="${env:ORACLE_USER}"
password="${env:ORACLE_PASS}" />
<connectorConfig>
<option name="service_name" value="ORCLPDB1" />
</connectorConfig>
</source>
| Parameter | Default | Description |
|---|---|---|
host |
Required | Database hostname or IP. |
port |
1521 |
Database port. |
database |
-- | Oracle SID. Required if service_name is not set. |
username |
-- | Credential via ${env:VAR}. |
password |
-- | Credential via ${env:VAR}. |
Connector Config Options¶
| Option | Description |
|---|---|
service_name |
Oracle service name (preferred over SID). |
oracle_client_lib |
Path to Oracle Instant Client libraries for THICK mode. |
THICK Mode (11g Support)¶
For Oracle 11g and older versions, set oracle_client_lib to enable THICK mode:
<connectorConfig>
<option name="oracle_client_lib" value="/opt/oracle/instantclient_21_12" />
</connectorConfig>
THICK mode is initialised once per process.
Performance Features¶
batcherrors (Writes)¶
The OracleDataWriter uses executemany with batcherrors=True and 10k-row chunking. This allows partial batch success -- rows that fail (e.g. constraint violations) are logged without aborting the entire batch.
setinputsizes¶
The writer calls setinputsizes() for float columns to ensure proper precision handling.
Vectorized Column Extraction¶
Uses Arrow's to_pylist() for C-level column extraction.
Identifier Length Handling¶
Oracle 11g and 12c limit identifiers to 30 characters. The connector automatically detects ORA-00972 (identifier too long) and retries with shortened names:
- Names exceeding 30 characters are truncated with a 4-character MD5 hash suffix.
- Example:
eventvisualization_categoryoptiongroupsetdimensionsbecomeseventvisualization_cate_a3f2 - The mapping is tracked so that subsequent operations (writes, indexes, FKs) use the correct names.
Oracle 12c Release 2+ supports 128-character identifiers by default.
Default Schema¶
The default schema is the connected user (owner), determined by SELECT user FROM dual.
Type Mapping¶
See Type Mappings > Oracle for the complete mapping tables.
Key points:
NUMBERwithout precision maps todecimal128(38, 10).NUMBER(p, 0)with scale 0 maps to native integer types (int16,int32,int64) based on precision.- Oracle
DATEincludes time, so it maps totimestamp[us](notdate32). - PK columns that would be
CLOBorBLOBare downsized toVARCHAR2(255).
DDL Retry Logic¶
| Error | Recovery |
|---|---|
ORA-00972 (identifier too long) |
Retry with shortened names (30-char limit with hash suffix) |
ORA-02000 (missing ALWAYS keyword) |
Retry without GENERATED BY DEFAULT AS IDENTITY (pre-12c) |
Known Limitations¶
- FK constraint names are prefixed with the table name and limited to 30 characters for Oracle 11g/12c compatibility.
ON UPDATE CASCADEis not supported by Oracle. OnlyON DELETE CASCADEandON DELETE SET NULLare applied.- Index creation skips Oracle-specific errors:
ORA-01408(already indexed),ORA-00955(name collision),ORA-01450(key too long),ORA-02327(cannot index LOB column).
Example BDL¶
<?xml version="1.0" encoding="UTF-8"?>
<bani schemaVersion="1.0">
<project name="oracle-to-pg" />
<source connector="oracle">
<connection host="oracle-server" port="1521"
username="${env:ORACLE_USER}"
password="${env:ORACLE_PASS}" />
<connectorConfig>
<option name="service_name" value="ORCLPDB1" />
</connectorConfig>
</source>
<target connector="postgresql">
<connection host="localhost" port="5432"
database="analytics"
username="${env:PG_USER}"
password="${env:PG_PASS}" />
</target>
<options>
<batchSize>100000</batchSize>
<parallelWorkers>4</parallelWorkers>
</options>
</bani>