To create vectors and use RAG in Oracle 23/26ai, we can encode data using an external API, or load an ONNX-formatted model and create a data mining model in the database using DBMS_VECTOR.LOAD_ONNX_MODEL (or with OML4Py).
The ONNX option usually has better performance.
*A slightly shorter syntax would use LOAD_ONNX_MODEL_CLOUD; it doesn't affect the core of this post.
I had problems loading a few such ONNX models from Hugging Face, so I compared one of them to the default model provided by Oracle, all_MiniLM_L12 (see "Now Available! Pre-built Embedding Generation model for Oracle Database 26ai").
I used Oracle Code Assist (with Cline and Codex) to compare the Oracle-provided version with the Hugging Face version, and to write major portions of this post.
Intro - How I started:
Uploaded the relevant ONNX file to an OCI bucket and created a PAR (Pre-Authenticated Request) for the bucket (or the file).
Used DBMS_CLOUD.CREATE_CREDENTIAL to create the credential:
```sql
BEGIN
  DBMS_CLOUD.CREATE_CREDENTIAL(
    credential_name => 'OCI_OBJ_CRED',
    user_ocid       => 'ocid1.user.oc1…',
    tenancy_ocid    => 'ocid1.tenancy.oc1…',
    private_key     => 'THEkeyString==',
    fingerprint     => '57:45:f…'
  );
END;
/
```
Downloaded the file into a database directory using DBMS_CLOUD.GET_OBJECT:
```sql
BEGIN
  DBMS_CLOUD.GET_OBJECT(
    credential_name => 'OCI_OBJ_CRED',
    object_uri      => 'https://objectstorage.eu-frankfurt-1.oraclecloud.com/p…/all_MiniLM_L12.onnx',
    directory_name  => 'DATA_PUMP_DIR',
    file_name       => 'all_MiniLM_L12.onnx'
  );
END;
/
```
Loaded the model using DBMS_VECTOR.LOAD_ONNX_MODEL:
```sql
BEGIN
  DBMS_VECTOR.LOAD_ONNX_MODEL(
    model_name => 'all_MiniLM_L12',
    directory  => 'DATA_PUMP_DIR',
    file_name  => 'all_MiniLM_L12.onnx'
  );
END;
/
```
It worked well for Oracle’s version of all_MiniLM_L12.onnx but gave an error with the version I downloaded from Hugging Face.
Got:
```
ORA-54408: The "input.input_text" field in the JSON metadata is not a model input.
ORA-06512: at "SYS.DBMS_VECTOR", line 3033
ORA-06512: at "SYS.DBMS_DATA_MINING", line 369
ORA-06512: at "SYS.DBMS_DATA_MINING", line 5795
ORA-06512: at "SYS.DBMS_VECTOR", line 3025
ORA-06512: at line 2
https://docs.oracle.com/error-help/db/ora-54408/
Error at Line: 1 Column: 1
```
After fixing the Hugging Face ONNX file as described below, this version of the LOAD_ONNX_MODEL call worked:
```sql
BEGIN
  DBMS_VECTOR.LOAD_ONNX_MODEL(
    directory  => 'DATA_PUMP_DIR',
    file_name  => 'all_MiniLM_L12_v2.onnx',
    model_name => 'all_MiniLM_L12_v2',
    metadata   => JSON('{
      "function": "embedding",
      "embeddingOutput": "sentence_embedding",
      "input": {"input_text": ["DATA"]}
    }')
  );
END;
/
```
# Core solution: ONNX + Oracle 23/26ai embeddings, a practical field guide
If you load custom ONNX embedding models with `DBMS_VECTOR.LOAD_ONNX_MODEL`, model validity alone is **not** enough.
You must align three things:
1. ONNX tensor contract (input/output names, dtypes, ranks)
2. Oracle metadata JSON schema (`embeddingOutput`, `input` mapping)
3. SQL alias mapping used in `VECTOR_EMBEDDING ... USING ... AS ...`
This guide summarizes a real troubleshooting path and the final repeatable pattern.
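The three-way alignment can be sketched as a plain pre-flight check (a minimal sketch; the tensor names and the `DATA` alias are taken from the MiniLM sample used later in this guide):

```python
# Minimal sketch: verify that the ONNX tensor contract, the Oracle metadata
# JSON, and the SQL alias all name the same things before LOAD_ONNX_MODEL.

def check_alignment(onnx_inputs, onnx_outputs, metadata, sql_alias):
    """Return a list of mismatches between the three contracts (empty = aligned)."""
    problems = []
    # 1) metadata "input" keys must be real ONNX input tensor names
    for name in metadata.get("input", {}):
        if name not in onnx_inputs:
            problems.append(f'metadata input "{name}" is not an ONNX input (ORA-54408)')
    # 2) "embeddingOutput" must be a real ONNX output tensor name
    out = metadata.get("embeddingOutput")
    if out not in onnx_outputs:
        problems.append(f'embeddingOutput "{out}" is not an ONNX output')
    # 3) the SQL alias (USING ... AS <alias>) must appear in the metadata mapping
    aliases = [a for vals in metadata.get("input", {}).values() for a in vals]
    if sql_alias not in aliases:
        problems.append(f'SQL alias "{sql_alias}" is not mapped in metadata "input"')
    return problems

# MiniLM sample contract: input "input", output "embedding", alias "DATA"
meta = {"function": "embedding", "embeddingOutput": "embedding",
        "input": {"input": ["DATA"]}}
print(check_alignment(["input"], ["embedding"], meta, "DATA"))       # aligned -> []
print(check_alignment(["input_text"], ["embedding"], meta, "DATA"))  # mismatch
```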
## 1) Why this path matters
In Oracle 23/26ai, you can generate embeddings via:
- external APIs
- OML4Py
- ONNX models loaded directly into the DB (this post)
The ONNX route offers more control and often better performance, but it is strict about model contract compatibility.
## 2) Baseline sanity check (known-good sample)
Use Oracle’s provided MiniLM sample as your baseline:
- `all_MiniLM_L12_v2_GOOD.onnx`
If baseline works and your custom model fails, your environment is likely fine, and the issue is model contract/metadata.
## 3) Error progression and what it usually means
### ORA-54446
`The embedding input tensor "input_ids" has an invalid element data type.`
**Meaning:** token-ID input model (`input_ids`, `attention_mask`, etc.) is being loaded into a text-input embedding path.
### ORA-54408
`The "input.<name>" field in the JSON metadata is not a model input.`
**Meaning:** metadata input mapping key doesn’t match real ONNX input tensor name.
### ORA-54435
`Invalid JSON field: "embedding"`
**Meaning:** wrong metadata key for your DB release. Use `embeddingOutput`.
### ORA-54442
`Incorrect batch size location in tensor "input_text"`
**Meaning:** input rank/shape is wrong for Oracle batch expectation.
Common failing shape: `input_text: STRING [1, num_sentences]`.
Common working shape: `input_text: STRING ['batch_size']`.
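The failing-vs-working distinction can be sketched in plain Python (shapes are lists in the same `dim_param`/`dim_value` style printed by the scripts below: strings for symbolic dims, ints for fixed dims):

```python
# Minimal sketch: classify a text-input tensor shape against Oracle's
# batch expectation for ONNX embedding models.

def batch_shape_ok(shape):
    """Rank-1 symbolic batch like ['batch_size'] passes;
    rank-2 like [1, 'num_sentences'] triggers ORA-54442."""
    return len(shape) == 1 and isinstance(shape[0], str)

print(batch_shape_ok(["batch_size"]))        # working shape -> True
print(batch_shape_ok([1, "num_sentences"]))  # failing shape -> False
```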
## 4) Working ONNX contract (MiniLM sample)
- input name: `input`
- input type/shape: `STRING ['batch_size']`
- output name: `embedding`
- output type/shape: `FLOAT ['batch_size', 384]`
## 5) Canonical metadata JSON (for this DB style)
```json
{
"function": "embedding",
"embeddingOutput": "embedding",
"input": {"input": ["DATA"]}
}
```
Notes:
- `embeddingOutput` must match a real ONNX output tensor name.
- `input` mapping key must match a real ONNX input tensor name.
- `DATA` is the SQL alias used in scoring (`USING ... AS DATA`).
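Rather than hand-editing the JSON literal, the metadata can be generated from a dict so quoting and key names stay consistent (a minimal sketch using the MiniLM sample's tensor names and alias):

```python
import json

# Build the metadata string passed to DBMS_VECTOR.LOAD_ONNX_MODEL.
metadata = {
    "function": "embedding",
    "embeddingOutput": "embedding",   # must be a real ONNX output tensor name
    "input": {"input": ["DATA"]},     # key = ONNX input name, "DATA" = SQL alias
}
metadata_json = json.dumps(metadata)
print(metadata_json)
```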
## 6) General ORA-54442 remediation (Good -> Good2 patch)
If your converted model fails with ORA-54442 and has rank-2 text input, patch input to rank-1 batch.
```python
import onnx
src = r"C:\path\to\your_model_Good.onnx"
dst = r"C:\path\to\your_model_Good2.onnx"
m = onnx.load(src)
i = m.graph.input[0]
# Force rank-1 batch input for text tensor
del i.type.tensor_type.shape.dim[:]
d = i.type.tensor_type.shape.dim.add()
d.dim_param = "batch_size"
onnx.checker.check_model(m)
onnx.save(m, dst)
print("saved", dst)
```
Quick validation:
```python
import onnx
m = onnx.load(r"C:\path\to\your_model_Good2.onnx")
i = m.graph.input[0]
o = m.graph.output[0]
print("IN", i.name, [d.dim_param if d.dim_param else d.dim_value for d in i.type.tensor_type.shape.dim])
print("OUT", o.name, [d.dim_param if d.dim_param else d.dim_value for d in o.type.tensor_type.shape.dim])
```
## 7) ONNX pre-check script (before upload)
```python
# save as: check_oracle_adb_onnx.py
from pathlib import Path
import sys

import onnx

TYPES = {
    1: "FLOAT", 2: "UINT8", 3: "INT8", 4: "UINT16", 5: "INT16",
    6: "INT32", 7: "INT64", 8: "STRING", 9: "BOOL", 10: "FLOAT16",
    11: "DOUBLE", 12: "UINT32", 13: "UINT64", 16: "BFLOAT16",
}

def shape_list(value_info):
    t = value_info.type.tensor_type
    return [d.dim_param if d.dim_param else d.dim_value for d in t.shape.dim]

def dtype_name(value_info):
    return TYPES.get(value_info.type.tensor_type.elem_type,
                     str(value_info.type.tensor_type.elem_type))

def check(path: Path) -> int:
    m = onnx.load(str(path))
    print(f"\n=== {path.name} ===")
    print("Inputs:")
    for i in m.graph.input:
        print(f"  - {i.name}: {dtype_name(i)}, shape={shape_list(i)}")
    print("Outputs:")
    for o in m.graph.output:
        print(f"  - {o.name}: {dtype_name(o)}, shape={shape_list(o)}")

    errors = []
    if len(m.graph.input) != 1:
        errors.append("Expected exactly 1 model input.")
    else:
        inp = m.graph.input[0]
        if dtype_name(inp) != "STRING":
            errors.append(f"Input dtype should be STRING, got {dtype_name(inp)}.")
        if len(shape_list(inp)) != 1:
            errors.append(f"Input rank should be 1 [batch_size], got rank {len(shape_list(inp))}.")
    if len(m.graph.output) < 1:
        errors.append("Expected at least 1 model output.")
    else:
        out = m.graph.output[0]
        if dtype_name(out) not in ("FLOAT", "FLOAT16", "BFLOAT16"):
            errors.append(f"Output dtype should be float-like, got {dtype_name(out)}.")
        if len(shape_list(out)) != 2:
            errors.append(f"Output rank should be 2 [batch_size, dim], got rank {len(shape_list(out))}.")

    if errors:
        print("\nRESULT: FAIL")
        for e in errors:
            print("  *", e)
        return 1
    print("\nRESULT: PASS (heuristically compatible)")
    return 0

if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: py -3 check_oracle_adb_onnx.py <model.onnx>")
        sys.exit(2)
    sys.exit(check(Path(sys.argv[1])))
```
Run:
```bash
py -3 check_oracle_adb_onnx.py "C:\path\to\your_model.onnx"
```
## 8) Working SQL pattern (MiniLM sample)
```sql
BEGIN
DBMS_CLOUD.GET_OBJECT(
credential_name => 'OCI_OBJ_CRED',
object_uri => 'https://objectstorage.<region>.oraclecloud.com/n/<ns>/b/<bucket>/o/all_MiniLM_L12_v2_GOOD.onnx',
directory_name => 'DATA_PUMP_DIR',
file_name => 'all_MiniLM_L12_v2_GOOD.onnx'
);
END;
/
```
```sql
BEGIN
DBMS_VECTOR.LOAD_ONNX_MODEL(
directory => 'DATA_PUMP_DIR',
file_name => 'all_MiniLM_L12_v2_GOOD.onnx',
model_name => 'ALL_MINILM_L12_V2_GOOD',
metadata => JSON('{
"function": "embedding",
"embeddingOutput": "embedding",
"input": {"input": ["DATA"]}
}')
);
END;
/
```
```sql
SELECT VECTOR_EMBEDDING(ALL_MINILM_L12_V2_GOOD USING 'Hello World' AS DATA) AS emb
FROM dual;
```
## 9) Operational checklist
1. Inspect ONNX input/output names before load.
2. Prefer explicit metadata over defaults.
3. Ensure `embeddingOutput` key matches your DB release.
4. If ORA-54442 appears, patch rank-2 text input to rank-1 batch.
5. Treat ORA errors as contract diagnostics, not random failures.
## Final takeaway
Reliable in-DB embeddings depend on strict interface alignment:
**ONNX tensor contract + Oracle metadata contract + SQL alias mapping**
Once aligned, `LOAD_ONNX_MODEL` and `VECTOR_EMBEDDING` become repeatable and predictable.
For an alternative approach using OML4Py, see "Running Hugging Face models inside Oracle 23ai with ONNX and OML4Py" by fsarcosdb.












