Just reminding myself the Oracle Support document link...
Best Practices Guide for Tuning Oracle® Analytics Server (Doc ID 2866848.2)
This Blog is personal and independent. It does not reflect the position or policy of Oracle. It is my "external memory" that helps me remember solutions I used and links I need. Nothing more.
We had a request to save a copy of every dataset created and saved in Oracle Analytics Server (OAS). I will cover only local OAS datasets that are not in the database. Usually, this kind of request comes from a legal demand to keep such data on an external server.
In theory, I can create a script that periodically copies the dataset files to a backup server/FTP; a sketch follows below.
The dataset files are saved on the server under Domain_Home/bidata/components/DSS/storage/ssi. Only files with the DSS extension are relevant. The file names are generated strings that have nothing to do with the dataset name in OAS.
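A minimal sketch of such a copy script in Python (the Domain_Home location and the backup directory below are assumptions; adjust them to your environment):

import shutil
from pathlib import Path

# Assumed locations - replace with your real Domain_Home and backup target
DSS_DIR = Path("/u01/oracle/Domain_Home/bidata/components/DSS/storage/ssi")
BACKUP_DIR = Path("/backup/oas_datasets")

def backup_dss_files():
    BACKUP_DIR.mkdir(parents=True, exist_ok=True)
    for src in DSS_DIR.glob("*.dss"):
        dst = BACKUP_DIR / src.name
        # Copy only files that are new or changed since the previous run
        if not dst.exists() or src.stat().st_mtime > dst.stat().st_mtime:
            shutil.copy2(src, dst)

if __name__ == "__main__":
    backup_dss_files()  # schedule with cron (or push BACKUP_DIR to the FTP server)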
My datasets can come from 2 different sources: loaded files and Data Flows.
The loaded-files-based DSS files are usually kept in the original file format (so renaming the DSS file makes it usable) and in Parquet format.
The Data Flow-based DSS files are in CSV format and in Parquet format.
For Parquet files we usually need a special reader utility.
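For example, a minimal sketch in Python, assuming pandas with the pyarrow engine is installed (the file name here is the src_parquet DSS file from the Data Flow example later in this post):

import pandas as pd

# pandas reads the Parquet content regardless of the .dss extension
df = pd.read_parquet(
    "/u01/oracle/Domain_Home/bidata/components/DSS/storage/ssi/"
    "f76f107c-cb48-4853-a0d7-52839d4f194e.dss"
)
print(df.head())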
How can I know what format those DSS files are in?
How can I know the real Dataset name?
The secret is in the XXX_BIPLATFORM schema table S_NQ_DSS_DATASETS.
In that table I see different behavior for loaded files vs. Data Flow-based ones.
While Data Flow-based datasets are represented by one row, loaded datasets are represented by a few rows (2 in my case).
For a Data Flow-based dataset, the DATASET_NAME column holds the real dataset name, and the DATAMODEL_ID is NULL. The MIMETYPE column value is "text/csv".
In the FILESTORE_ID column I see the DSS file names: the CSV file and the Parquet file. For example:
{"src":"06a918f7-916f-49de-be33-9c046d53e869","src_parquet":"f76f107c-cb48-4853-a0d7-52839d4f194e"}
Checking Domain_Home/bidata/components/DSS/storage/ssi, I can see the 2 DSS files there.
Now I have 2 options to handle the files: use the CSV file as is, or read the Parquet file.
For a loaded-files-based dataset, we have a dataset row and the real data row in the S_NQ_DSS_DATASETS table. It is very similar to what we see in the UI, where we can have several tables or files in a single dataset.
For the dataset row, the DATASET_NAME column holds the real dataset name, and the DATAMODEL_ID is NULL. The MIMETYPE column value is NULL. We can disregard the FILESTORE_ID column here. The FILE_LENGTH column value is -1.
For the data row in S_NQ_DSS_DATASETS, the DATAMODEL_ID column holds the DATASET_ID value of the dataset row. By joining the 2 rows we can get the original dataset name for the data row.
In the data row's FILESTORE_ID column I see the DSS file names: the CSV file and the Parquet file. For example:
{"src":"8f815cc1-044a-4caa-9026-431be828e347","src_parquet":"261dfba0-9adc-4572-a7d7-9cd4c95dccf5"}
Checking Domain_Home/bidata/components/DSS/storage/ssi, I can see the 2 DSS files there.
How do I know the proper format? By using the MIMETYPE column where I have values:
Now I have 2 options to handle the files: use the original-format file as is, or read the Parquet file.
I recommend a SELECT of the following type to work with the data; this way you have the proper DSS file names, and in FULL_NAME the name of the dataset:
SELECT
a.dataset_id,
a.stripe_id,
a.dataset_name,
b.dataset_name DS_name,
b.dataset_name||(case when b.dataset_name IS NULL then '' else '-' end)||a.dataset_name FULL_NAME,
a.filestore_id,
-- the "src" file id starts at position 9 and ends just before the closing quote and comma
substr(a.filestore_id, 9, instr(a.filestore_id, ',')-1-9)||'.dss' Orig_file,
-- the "src_parquet" file id follows the last ':' (NULL when the JSON value is null)
case when substr(a.filestore_id, instr(a.filestore_id, ':', -1)+1,4)='null' then NULL else substr(a.filestore_id, instr(a.filestore_id, ':', -1)+2, length(a.filestore_id)-(instr(a.filestore_id, ':', -1)+2)-1 )||'.dss' end Parquet_file,
a.acl_id,
a.filetype_id,
a.mimetype,
a.owner_id,
a.createdby_id,
a.provider_id,
a.type_options_xml,
a.provider_options_xml,
a.filetype_knobs,
a.file_length,
a.storage_file_length,
a.last_modified,
a.created_date,
a.biservermetadata_last_modified,
a.rawfile_last_modified,
a.is_embryonic,
a.is_user_striped,
a.internal_id,
a.namespace,
a.connection_id,
a.partition_key,
a.refresh_option,
a.schedule_xml,
a.datamodel_id,
a.error_code,
a.error_message,
a.parent_dataset_id,
a.child_dataset_type,
a.app_module_id,
a.properties_last_modified
FROM
dev_biplatform.s_nq_dss_datasets A, dev_biplatform.s_nq_dss_datasets B
WHERE a.file_length <> -1 AND a.datamodel_id = b.dataset_id(+)
ORDER BY a.last_modified DESC
;
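Putting it together, here is a minimal Python sketch that runs a reduced version of the query and copies each dataset's "src" DSS file to the backup location under its real name. The connection details and paths are assumptions, and it needs the python-oracledb driver:

import json
import shutil
from pathlib import Path
import oracledb

# Assumed locations and credentials - adjust to your environment
DSS_DIR = Path("/u01/oracle/Domain_Home/bidata/components/DSS/storage/ssi")
BACKUP_DIR = Path("/backup/oas_datasets")

oracledb.defaults.fetch_lobs = False  # fetch any CLOB columns as plain strings
conn = oracledb.connect(user="dev_biplatform", password="secret", dsn="dbhost/orclpdb")
cur = conn.cursor()
cur.execute("""
    SELECT b.dataset_name || (CASE WHEN b.dataset_name IS NULL THEN '' ELSE '-' END)
           || a.dataset_name AS full_name,
           a.filestore_id
    FROM   dev_biplatform.s_nq_dss_datasets a,
           dev_biplatform.s_nq_dss_datasets b
    WHERE  a.file_length <> -1
    AND    a.datamodel_id = b.dataset_id(+)
""")
BACKUP_DIR.mkdir(parents=True, exist_ok=True)
for full_name, filestore_id in cur:
    ids = json.loads(filestore_id)          # {"src":"...","src_parquet":"..."}
    src = DSS_DIR / (ids["src"] + ".dss")
    if src.exists():
        # Note: dataset names may contain characters that are illegal in file names
        shutil.copy2(src, BACKUP_DIR / (full_name + ".dss"))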
Please notice the Parquet files don't have the actual column names and use names like F0, F1.
You can find the real names in the following 2 columns: TYPE_OPTIONS_XML and PROVIDER_OPTIONS_XML.
The XML in both columns is similar. I found 3 copies of the file's column names in different formats.
Here is a TYPE_OPTIONS_XML example:
With APEX we can create a chatbot and connect it to an AI data source. There are lots of good materials about it, but I was struggling with how to pass the user input text as part of the RAG code. My colleague Yohay Asraf gave me the tip of using APEX$AI_LAST_USER_PROMPT, which solved my problem. Thanks, Yohay!
In APEX we can now use AI options and configure RAG sources, Oracle 23ai Database features like SelectAI, and OCI AI Agents. See links to a few Labs at the end of the post.
In an APEX application, under "Shared Components" - "Generative AI Configuration", I created a RAG query that uses the Vector option of 23ai, but how can we make it dynamic?
In my case, I loaded various car guides into a table and created chunks of the books and vectors from those chunks. Next, I'm looking for chunks of text that are similar to a specific text ('Bla Bla Bla' in the next example).
Here is the example:
WITH query_vector AS (
SELECT VECTOR_EMBEDDING(doc_model USING 'Bla Bla Bla' AS data) as embedding)
SELECT embed_id, TEXT_CHUNK
FROM WKSP_APEX.VECTOR_STORE, query_vector
WHERE VECTOR_DISTANCE(EMBED_VECTOR, query_vector.embedding, COSINE)<0.6
AND VEHICLE_MAKE=:P2_MAKE AND VEHICLE_MODEL=:P2_MODEL
ORDER BY VECTOR_DISTANCE(EMBED_VECTOR, query_vector.embedding, COSINE)
FETCH APPROX FIRST 4 ROWS ONLY
How do I make the input string dynamic? Here we have 2 options:
APEX$AI_LAST_USER_PROMPT - Retrieves the most recent user-entered prompt.
APEX$AI_ALL_USER_PROMPTS - Retrieves all user prompts, concatenated into a single string.
Here is the new code:
WITH query_vector AS (
SELECT VECTOR_EMBEDDING(doc_model USING :APEX$AI_LAST_USER_PROMPT AS data) as embedding)
SELECT embed_id, TEXT_CHUNK
FROM WKSP_APEX.VECTOR_STORE, query_vector
WHERE VECTOR_DISTANCE(EMBED_VECTOR, query_vector.embedding, COSINE)<0.6
AND VEHICLE_MAKE=:P2_MAKE AND VEHICLE_MODEL=:P2_MODEL
ORDER BY VECTOR_DISTANCE(EMBED_VECTOR, query_vector.embedding, COSINE)
FETCH APPROX FIRST 4 ROWS ONLY
As you can see, we are using 2 RAG sources: a Car Problems table and a Books table.
There is a nice blog by Harry Snart that uses it - RAG based Chatbots made easy
If you want to start using AI or RAG and APEX, there are many Blogs and great labs in Oracle LiveLabs.
Last week Oracle announced the availability of the Oracle DB MCP server, joining the general MCP-server hype rather fast. Most of the materials were published by https://x.com/thatjeffsmith and https://x.com/krisrice. All I do here is gather a few links from the webcasts and X for future use.
They wrote about it here: Introducing MCP Server for Oracle Database
The MCP server is part of Oracle SQLcl. It should work with all supported Oracle DB versions (On-Prem. & Cloud), meaning 19c and up.
If you want a general MCP intro: https://modelcontextprotocol.io/introduction
There were 2 webcasts. The first (and longer) one covers everything that is in the shorter one, so I advise watching only the first.
Introducing SQL Developer Copilot Integration in Microsoft VSCode & MCP Support for Oracle Database: https://youtu.be/hj6WoZVGUBg?si=9C58_wcDBE9wkruT (1 hour).
The shorter - Oracle Developer Coaching: July News Edition https://youtu.be/Xnmty5_-v-M (33 min)
Latest version of Oracle SQLcl download link:
https://www.oracle.com/uk/database/sqldeveloper/technologies/sqlcl/download/
or direct link to the 25.2 version https://download.oracle.com/otn_software/java/sqldeveloper/sqlcl-25.2.0.184.2054.zip
Documentation: https://docs.oracle.com/en/database/oracle/sql-developer-command-line/25.2/sqcug/using-oracle-sqlcl-mcp-server.html
Example MCP Clients: https://github.com/punkpeye/awesome-mcp-clients
Example MCP Servers: https://github.com/punkpeye/awesome-mcp-servers
AI for code tooling: https://github.com/oracle-samples/ml4code/tree/main/08_tooling
A blog on the subject by Hamza Eraoui: SQLcl MCP Server: Tips, Tricks, and Use Cases | by Hamza Eraoui | Jul, 2025 | Medium
Jeff Smith himself:
And an intro by Paulo Portugal: Oracle SQLcl MCP: Bringing AI to SQL with Model Context Protocol
Eric Avidon on Techtarget: Oracle adds MCP support to advance agentic AI development
And what looks like an Oracle internal reference: https://ml4code.oraclecorp.com/libs/#mcp