Monday, March 16, 2026

Using Hugging Face ONNX Models in Oracle 26ai, and Fixing the ONNX File

To create vectors and use RAG in Oracle 23ai/26ai, we can encode data via an external API, or load an ONNX-formatted model and create a data mining model in the DB using DBMS_VECTOR.LOAD_ONNX_MODEL (or with OML4Py).

The ONNX option usually has better performance.

*A slightly shorter syntax would have been DBMS_VECTOR.LOAD_ONNX_MODEL_CLOUD; it doesn't affect the core of this post.
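For reference, a sketch of that cloud variant, assuming the documented DBMS_VECTOR.LOAD_ONNX_MODEL_CLOUD signature; the credential name and URI below are placeholders, not values from this post:

```sql
BEGIN
  DBMS_VECTOR.LOAD_ONNX_MODEL_CLOUD(
    model_name => 'all_MiniLM_L12',
    credential => 'OCI_OBJ_CRED',
    uri        => 'https://objectstorage.<region>.oraclecloud.com/n/<ns>/b/<bucket>/o/all_MiniLM_L12.onnx'
  );
END;
/
```

This collapses the GET_OBJECT + LOAD_ONNX_MODEL steps shown below into one call.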

I had problems loading a few such ONNX models from Hugging Face, so I compared one of them to a default model provided by Oracle, all_MiniLM_L12 (see Now Available! Pre-built Embedding Generation model for Oracle Database 26ai).

I used Oracle Code Assist (with Cline and Codex) to compare the Oracle-provided version with the Hugging Face version, and to write major portions of this blog.

Intro - How I started:

Uploaded the relevant ONNX file to an OCI bucket and created a PAR (Pre-Authenticated Request) for the bucket (or for the file alone).

Used DBMS_CLOUD.CREATE_CREDENTIAL to set up the credentials:

```sql
BEGIN
  DBMS_CLOUD.CREATE_CREDENTIAL(
    credential_name => 'OCI_OBJ_CRED',
    user_ocid       => 'ocid1.user.oc1… ',
    tenancy_ocid    => 'ocid1.tenancy.oc1… ',
    private_key     => 'THEkeyString==',
    fingerprint     => '57:45:f'
  );
END;
/
```

Downloaded the file into a database directory using DBMS_CLOUD.GET_OBJECT:

```sql
BEGIN
  DBMS_CLOUD.GET_OBJECT(
    credential_name => 'OCI_OBJ_CRED',
    object_uri      => 'https://objectstorage.eu-frankfurt-1.oraclecloud.com/p
                        /all_MiniLM_L12.onnx',
    directory_name  => 'DATA_PUMP_DIR',
    file_name       => 'all_MiniLM_L12.onnx'
  );
END;
/
```

Loaded the model using DBMS_VECTOR.LOAD_ONNX_MODEL:

```sql
BEGIN
  DBMS_VECTOR.LOAD_ONNX_MODEL(
    model_name => 'all_MiniLM_L12',
    directory  => 'DATA_PUMP_DIR',
    file_name  => 'all_MiniLM_L12.onnx'
  );
END;
/
```

It worked well for Oracle's version of all_MiniLM_L12.onnx, but gave me an error with the version I downloaded from Hugging Face.

Got:

```
ORA-54408: The "input.input_text" field in the JSON metadata is not a model input.
ORA-06512: at "SYS.DBMS_VECTOR", line 3033
ORA-06512: at "SYS.DBMS_DATA_MINING", line 369
ORA-06512: at "SYS.DBMS_DATA_MINING", line 5795
ORA-06512: at "SYS.DBMS_VECTOR", line 3025
ORA-06512: at line 2
https://docs.oracle.com/error-help/db/ora-54408/
Error at Line: 1 Column: 1
```


After fixing the Hugging Face ONNX file as described below, this version of LOAD_ONNX_MODEL worked:

```sql
BEGIN
  DBMS_VECTOR.LOAD_ONNX_MODEL(
    directory  => 'DATA_PUMP_DIR',
    file_name  => 'all_MiniLM_L12_v2.onnx',
    model_name => 'all_MiniLM_L12_v2',
    metadata   => JSON('{
      "function":"embedding",
      "embeddingOutput": "sentence_embedding",
      "input":{"input_text": ["DATA"]}
    }')
  );
END;
/
```


## Core solution: ONNX + Oracle 23/26ai Embeddings: A Practical Field Guide

If you load custom ONNX embedding models with `DBMS_VECTOR.LOAD_ONNX_MODEL`, model validity alone is **not** enough.

You must align three things:

1. ONNX tensor contract (input/output names, dtypes, ranks)

2. Oracle metadata JSON schema (`embeddingOutput`, `input` mapping)

3. SQL alias mapping used in `VECTOR_EMBEDDING ... USING ... AS ...`

This guide summarizes a real troubleshooting path and the final repeatable pattern.


## 1) Why this path matters

In Oracle 23/26ai, you can generate embeddings:

- via external APIs
- using OML4Py
- by loading ONNX models directly into the DB (this post)

The ONNX route often gives better control and performance, but it is strict about model contract compatibility.

## 2) Baseline sanity check (known-good sample)

Use Oracle’s provided MiniLM sample as your baseline:

- `all_MiniLM_L12_v2_GOOD.onnx`

If the baseline works and your custom model fails, your environment is likely fine and the issue is the model contract or metadata.

## 3) Error progression and what it usually means

### ORA-54446

`The embedding input tensor "input_ids" has an invalid element data type.`

**Meaning:** token-ID input model (`input_ids`, `attention_mask`, etc.) is being loaded into a text-input embedding path.


### ORA-54408

`The "input.<name>" field in the JSON metadata is not a model input.`

**Meaning:** metadata input mapping key doesn’t match real ONNX input tensor name.
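This mismatch can be spotted before any load attempt; a minimal sketch using only Python's `json` module, where `model_inputs` stands in for tensor names you would read from an ONNX inspector (hypothetical values here):

```python
import json

# Hypothetical: the model's actual input tensor names, as an inspector would report them
model_inputs = ["input"]

# A metadata mapping whose key does NOT match any real input tensor name.
# Oracle rejects such a mapping at load time with ORA-54408.
metadata = json.loads("""{
  "function": "embedding",
  "embeddingOutput": "embedding",
  "input": {"input_text": ["DATA"]}
}""")

# Any key left over here predicts an ORA-54408
bad_keys = [k for k in metadata["input"] if k not in model_inputs]
print(bad_keys)  # ['input_text']
```

An empty `bad_keys` list does not guarantee the load succeeds (dtypes and ranks still matter), but a non-empty one guarantees it fails.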


### ORA-54435

`Invalid JSON field: "embedding"`

**Meaning:** wrong metadata key for your DB release. Use `embeddingOutput`.


### ORA-54442

`Incorrect batch size location in tensor "input_text"`

**Meaning:** input rank/shape is wrong for Oracle batch expectation.

Common failing shape: `input_text: STRING [1, num_sentences]`.

Common working shape: `input_text: STRING ['batch_size']`.


## 4) Working ONNX contract (MiniLM sample)

- input name: `input`

- input type/shape: `STRING ['batch_size']`

- output name: `embedding`

- output type/shape: `FLOAT ['batch_size', 384]`


## 5) Canonical metadata JSON (for this DB style)

```json
{
  "function": "embedding",
  "embeddingOutput": "embedding",
  "input": {"input": ["DATA"]}
}
```

Notes:

- `embeddingOutput` must match a real ONNX output tensor name.

- `input` mapping key must match a real ONNX input tensor name.

- `DATA` is the SQL alias used in scoring (`USING ... AS DATA`).
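These three notes can be cross-checked mechanically. A sketch using the MiniLM sample's names from section 4 (the `onnx_contract` values are taken from that section; in practice you would read them from the model file):

```python
# The three contracts that must line up, per the notes above
onnx_contract = {"input": "input", "output": "embedding"}   # real tensor names
metadata = {
    "function": "embedding",
    "embeddingOutput": "embedding",
    "input": {"input": ["DATA"]},
}
sql_alias = "DATA"  # the alias used in VECTOR_EMBEDDING(... USING ... AS DATA)

# 1) embeddingOutput must name a real ONNX output tensor
assert metadata["embeddingOutput"] == onnx_contract["output"]
# 2) the metadata input key must name a real ONNX input tensor
assert onnx_contract["input"] in metadata["input"]
# 3) the SQL alias must appear in that input's mapping list
assert sql_alias in metadata["input"][onnx_contract["input"]]
print("contracts aligned")
```

If any assertion fires, you know which of the three interfaces to fix before touching the database.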


## 6) General ORA-54442 remediation (Good -> Good2 patch)

If your converted model fails with ORA-54442 and has rank-2 text input, patch input to rank-1 batch.

```python
import onnx

src = r"C:\path\to\your_model_Good.onnx"
dst = r"C:\path\to\your_model_Good2.onnx"

m = onnx.load(src)
i = m.graph.input[0]

# Force rank-1 batch input for the text tensor: ['batch_size']
del i.type.tensor_type.shape.dim[:]
d = i.type.tensor_type.shape.dim.add()
d.dim_param = "batch_size"

onnx.checker.check_model(m)
onnx.save(m, dst)
print("saved", dst)
```

Quick validation:

```python
import onnx

m = onnx.load(r"C:\path\to\your_model_Good2.onnx")
i = m.graph.input[0]
o = m.graph.output[0]
print("IN", i.name, [d.dim_param if d.dim_param else d.dim_value for d in i.type.tensor_type.shape.dim])
print("OUT", o.name, [d.dim_param if d.dim_param else d.dim_value for d in o.type.tensor_type.shape.dim])
```


## 7) ONNX pre-check script (before upload)

```python
# save as: check_oracle_adb_onnx.py
from pathlib import Path
import sys

import onnx

TYPES = {
    1: "FLOAT", 2: "UINT8", 3: "INT8", 4: "UINT16", 5: "INT16",
    6: "INT32", 7: "INT64", 8: "STRING", 9: "BOOL", 10: "FLOAT16",
    11: "DOUBLE", 12: "UINT32", 13: "UINT64", 16: "BFLOAT16",
}

def shape_list(value_info):
    t = value_info.type.tensor_type
    return [d.dim_param if d.dim_param else d.dim_value for d in t.shape.dim]

def dtype_name(value_info):
    return TYPES.get(value_info.type.tensor_type.elem_type, str(value_info.type.tensor_type.elem_type))

def check(path: Path) -> int:
    m = onnx.load(str(path))
    print(f"\n=== {path.name} ===")
    print("Inputs:")
    for i in m.graph.input:
        print(f"  - {i.name}: {dtype_name(i)}, shape={shape_list(i)}")
    print("Outputs:")
    for o in m.graph.output:
        print(f"  - {o.name}: {dtype_name(o)}, shape={shape_list(o)}")

    errors = []
    if len(m.graph.input) != 1:
        errors.append("Expected exactly 1 model input.")
    else:
        inp = m.graph.input[0]
        if dtype_name(inp) != "STRING":
            errors.append(f"Input dtype should be STRING, got {dtype_name(inp)}.")
        if len(shape_list(inp)) != 1:
            errors.append(f"Input rank should be 1 [batch_size], got rank {len(shape_list(inp))}.")

    if len(m.graph.output) < 1:
        errors.append("Expected at least 1 model output.")
    else:
        out = m.graph.output[0]
        if dtype_name(out) not in ("FLOAT", "FLOAT16", "BFLOAT16"):
            errors.append(f"Output dtype should be float-like, got {dtype_name(out)}.")
        if len(shape_list(out)) != 2:
            errors.append(f"Output rank should be 2 [batch_size, dim], got rank {len(shape_list(out))}.")

    if errors:
        print("\nRESULT: FAIL")
        for e in errors:
            print("  *", e)
        return 1
    print("\nRESULT: PASS (heuristically compatible)")
    return 0

if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: py -3 check_oracle_adb_onnx.py <model.onnx>")
        sys.exit(2)
    sys.exit(check(Path(sys.argv[1])))
```


Run:

```bash
py -3 check_oracle_adb_onnx.py "C:\path\to\your_model.onnx"
```

## 8) Working SQL pattern (MiniLM sample)

```sql
BEGIN
  DBMS_CLOUD.GET_OBJECT(
    credential_name => 'OCI_OBJ_CRED',
    object_uri      => 'https://objectstorage.<region>.oraclecloud.com/n/<ns>/b/<bucket>/o/all_MiniLM_L12_v2_GOOD.onnx',
    directory_name  => 'DATA_PUMP_DIR',
    file_name       => 'all_MiniLM_L12_v2_GOOD.onnx'
  );
END;
/
```


```sql
BEGIN
  DBMS_VECTOR.LOAD_ONNX_MODEL(
    directory  => 'DATA_PUMP_DIR',
    file_name  => 'all_MiniLM_L12_v2_GOOD.onnx',
    model_name => 'ALL_MINILM_L12_V2_GOOD',
    metadata   => JSON('{
      "function": "embedding",
      "embeddingOutput": "embedding",
      "input": {"input": ["DATA"]}
    }')
  );
END;
/
```


```sql
SELECT VECTOR_EMBEDDING(ALL_MINILM_L12_V2_GOOD USING 'Hello World' AS DATA) AS emb
FROM dual;
```

## 9) Operational checklist

1. Inspect ONNX input/output names before load.

2. Prefer explicit metadata over defaults.

3. Ensure `embeddingOutput` key matches your DB release.

4. If ORA-54442 appears, patch rank-2 text input to rank-1 batch.

5. Treat ORA errors as contract diagnostics, not random failures.
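As a memory aid for point 5, the error-to-cause mapping from section 3 can be kept as a small lookup; the wording below is my summary, not Oracle's error text:

```python
# Summarizes section 3: which contract check each ORA error points at
ORA_HINTS = {
    "ORA-54446": "model expects token IDs (input_ids etc.), not a STRING text input",
    "ORA-54408": "metadata 'input' key does not match a real ONNX input tensor name",
    "ORA-54435": "wrong metadata key for this release; use 'embeddingOutput'",
    "ORA-54442": "text input is rank 2 like [1, n]; patch it to rank 1 ['batch_size']",
}

def hint(ora_code: str) -> str:
    # Fall back to a neutral message for codes outside this post's scope
    return ORA_HINTS.get(ora_code, "not a known embedding-contract error")

print(hint("ORA-54442"))
```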


## Final takeaway

Reliable in-DB embeddings depend on strict interface alignment:

**ONNX tensor contract + Oracle metadata contract + SQL alias mapping**

Once aligned, `LOAD_ONNX_MODEL` and `VECTOR_EMBEDDING` become repeatable and predictable.


An alternative approach using OML4Py: Running Hugging Face models inside Oracle 23ai with ONNX and OML4Py by fsarcosdb


