Thursday, August 7, 2025

Hacking OAS dataset files

We had a request to save a copy of every dataset created and saved in Oracle Analytics Server (OAS). I will cover only local OAS datasets that are not stored in the database. Usually, this kind of request comes from a legal demand to keep such data on an external server.

In theory, I can create a script that periodically copies the dataset files to a backup server/FTP.

The dataset files are saved on the server under:

  • Domain_Home/bidata/components/DSS/storage/ssi
  • Domain_Home/servers/obis1/cache/xsastorage
  • Domain_Home/servers/obis1/cache/xsacache
NEVER update or do anything directly to those files. Work only with copies of the files, placed in your own folders.
NEVER manually change the RCU database data.

The files I want are in Domain_Home/bidata/components/DSS/storage/ssi. Only files with the DSS extension are relevant. The file names are strings that have nothing to do with the dataset names in OAS.



My datasets can come from 2 different sources:

  • A file (CSV/Text/Excel) loaded as dataset from the UI.
  • A dataset created by Data Flow.

The DSS files of a loaded dataset are kept both in the original file format (so renaming the DSS file with the proper extension makes it usable) and in Parquet format.

The DSS files of a Data Flow based dataset are kept in CSV format and in Parquet format.

For Parquet files we usually need a special reader utility. 
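For example, a renamed Parquet DSS file can be opened with a minimal Python sketch like the one below, using pandas with the pyarrow engine. The Domain_Home path is a placeholder for your environment; the file name is the Parquet example that appears later in this post.

import shutil
import pandas as pd  # Parquet support requires the pyarrow or fastparquet package

SSI = "/u01/domains/bi/bidata/components/DSS/storage/ssi"  # substitute your Domain_Home

# Never touch the original DSS file; work on a copy with a proper extension
shutil.copy2(f"{SSI}/f76f107c-cb48-4853-a0d7-52839d4f194e.dss", "/tmp/dataset.parquet")

df = pd.read_parquet("/tmp/dataset.parquet")
print(df.head())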


How can I know what those DSS file formats are?

How can I know the real Dataset name?

The secret is in the table S_NQ_DSS_DATASETS of the XXX_BIPLATFORM schema.


In that table I see different behavior for loaded files vs. Data Flow based ones.

While the Data Flow based datasets are represented by one row, the loaded datasets are represented by a few rows (2 in my case).

Data Flow based dataset

For a Data Flow based dataset, the DATASET_NAME column holds the real Dataset name, and the DATAMODEL_ID is NULL. The MIMETYPE column value is "text/csv".

In the FILESTORE_ID column I see the DSS file names: the CSV file and the Parquet file. For example:

{"src":"06a918f7-916f-49de-be33-9c046d53e869","src_parquet":"f76f107c-cb48-4853-a0d7-52839d4f194e"}

Checking the Domain_Home/bidata/components/DSS/storage/ssi folder, I can see 2 DSS files:

  • 06a918f7-916f-49de-be33-9c046d53e869.dss - actually a CSV file
  • f76f107c-cb48-4853-a0d7-52839d4f194e.dss - actually a Parquet file
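Since the FILESTORE_ID value is plain JSON, mapping it to the physical DSS files is straightforward. A minimal Python sketch, where the ssi path is a placeholder for your Domain_Home:

import json
import os

SSI = "/u01/domains/bi/bidata/components/DSS/storage/ssi"  # substitute your Domain_Home

filestore_id = '{"src":"06a918f7-916f-49de-be33-9c046d53e869","src_parquet":"f76f107c-cb48-4853-a0d7-52839d4f194e"}'

ids = json.loads(filestore_id)
csv_file = os.path.join(SSI, ids["src"] + ".dss")              # actually a CSV file
parquet_file = os.path.join(SSI, ids["src_parquet"] + ".dss")  # actually a Parquet file
print(csv_file, parquet_file)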

Now I have 2 options for handling the files:

  • Checking the S_NQ_DSS_DATASETS table for updates (the LAST_MODIFIED column), finding the relevant file names in FILESTORE_ID, and copying them with the correct filenames and extensions (see the sketch after this list).
  • Checking the folder for new files at the Linux/Windows level and finding the file names in the S_NQ_DSS_DATASETS table to get the proper dataset name and extension.
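A minimal sketch of the first option for Data Flow based datasets, assuming the python-oracledb driver and the dev_biplatform schema used in the SELECT at the end of this post. The DSN, password, paths, and the assumption that LAST_MODIFIED can be compared to a timestamp bind are all placeholders to verify against your own repository.

import json
import shutil
from datetime import datetime, timedelta

import oracledb  # python-oracledb driver: pip install oracledb

SSI = "/u01/domains/bi/bidata/components/DSS/storage/ssi"  # substitute your Domain_Home
BACKUP = "/backup/oas_datasets"                            # external backup target

oracledb.defaults.fetch_lobs = False  # return CLOB columns as plain strings

# In a real job this timestamp would be persisted between runs
last_run = datetime.now() - timedelta(hours=1)

conn = oracledb.connect(user="dev_biplatform", password="***", dsn="dbhost/orclpdb")
cur = conn.cursor()
# Data rows only (FILE_LENGTH <> -1) that changed since the previous run
cur.execute("""
    SELECT dataset_name, filestore_id
      FROM dev_biplatform.s_nq_dss_datasets
     WHERE file_length <> -1
       AND mimetype = 'text/csv'
       AND last_modified > :since""", since=last_run)

for name, filestore_id in cur:
    ids = json.loads(filestore_id)
    # Copy both twins under a readable name with the proper extension
    shutil.copy2(f"{SSI}/{ids['src']}.dss", f"{BACKUP}/{name}.csv")
    shutil.copy2(f"{SSI}/{ids['src_parquet']}.dss", f"{BACKUP}/{name}.parquet")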


Loaded file based dataset

For a loaded file based dataset, we have a dataset row and the real data row in the S_NQ_DSS_DATASETS table. This is very similar to what we see in the UI, where we can have several tables or files in a single dataset.

For the row representing the entire dataset, the DATASET_NAME column holds the real Dataset name, and the DATAMODEL_ID is NULL. The MIMETYPE column value is NULL. We can disregard the FILESTORE_ID column here. The FILE_LENGTH column value is -1.

For the data row in S_NQ_DSS_DATASETS, the DATAMODEL_ID column holds the DATASET_ID value of the dataset row. By joining the two, we can get the original dataset name for the data row.

In the data row, the FILESTORE_ID column again holds the DSS file names: the original file and the Parquet file. For example:

{"src":"8f815cc1-044a-4caa-9026-431be828e347","src_parquet":"261dfba0-9adc-4572-a7d7-9cd4c95dccf5"}

Checking the Domain_Home/bidata/components/DSS/storage/ssi folder, I can see 2 DSS files:

  • 8f815cc1-044a-4caa-9026-431be828e347.dss - actually a CSV, Excel, or text file
  • 261dfba0-9adc-4572-a7d7-9cd4c95dccf5.dss - actually a Parquet file

How do I know the proper format? By using the MIMETYPE column, where I have these values (see the lookup table below):

  • application/csv for a CSV file.
  • application/vnd.openxmlformats-officedocument.spreadsheetml.sheet for an Excel file.
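For scripting, these MIME types translate naturally into file extensions. A small Python lookup table; text/csv is the Data Flow value from earlier, and the entries are only the ones I have seen, so extend it for other load formats such as plain text:

# MIME type -> extension for renaming copied DSS files
MIME_TO_EXT = {
    "text/csv": ".csv",          # Data Flow based datasets
    "application/csv": ".csv",   # loaded CSV files
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet": ".xlsx",  # loaded Excel files
}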

Now I have 2 options for handling the files:

  • Checking the S_NQ_DSS_DATASETS table for updates (the LAST_MODIFIED column), finding the relevant file names in FILESTORE_ID, and copying them with the correct filenames and extensions.
  • Checking the folder for new files at the Linux/Windows level and finding the file names in the S_NQ_DSS_DATASETS table to get the proper dataset name and extension (see the sketch after this list).
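A minimal sketch of the second option, again assuming python-oracledb and the dev_biplatform schema; the paths and credentials are placeholders. A real job would also remember which files it has already seen (for example by modification time) instead of scanning everything:

import os
import oracledb  # python-oracledb driver: pip install oracledb

SSI = "/u01/domains/bi/bidata/components/DSS/storage/ssi"  # substitute your Domain_Home

conn = oracledb.connect(user="dev_biplatform", password="***", dsn="dbhost/orclpdb")
cur = conn.cursor()

for fname in os.listdir(SSI):
    if not fname.endswith(".dss"):
        continue
    file_id = fname[:-4]  # strip the .dss extension
    # Locate the data row whose FILESTORE_ID JSON mentions this file id
    cur.execute("""
        SELECT dataset_name, mimetype
          FROM dev_biplatform.s_nq_dss_datasets
         WHERE file_length <> -1
           AND filestore_id LIKE '%' || :id || '%'""", id=file_id)
    row = cur.fetchone()
    if row:
        print(file_id, "->", row[0], row[1])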


To work with the data, I recommend a SELECT of the following type; this way you have the proper names of the DSS files and, in FULL_NAME, the name of the dataset:


SELECT
    a.dataset_id,
    a.stripe_id,
    a.dataset_name,
    b.dataset_name DS_name,
    b.dataset_name||'-'||a.dataset_name FULL_NAME,
    a.filestore_id,
    substr(a.filestore_id, 9, instr(a.filestore_id, ',')-1-9)||'.dss' Orig_file,
    substr(a.filestore_id, instr(a.filestore_id, ',')+16, instr(a.filestore_id, '}')-(instr(a.filestore_id, ',')+17))||'.dss' Parquet_file,
    a.acl_id,
    a.filetype_id,
    a.mimetype,
    a.owner_id,
    a.createdby_id,
    a.provider_id,
    a.type_options_xml,
    a.provider_options_xml,
    a.filetype_knobs,
    a.file_length,
    a.storage_file_length,
    a.last_modified,
    a.created_date,
    a.biservermetadata_last_modified,
    a.rawfile_last_modified,
    a.is_embryonic,
    a.is_user_striped,
    a.internal_id,
    a.namespace,
    a.connection_id,
    a.partition_key,
    a.refresh_option,
    a.schedule_xml,
    a.datamodel_id,
    a.error_code,
    a.error_message,
    a.parent_dataset_id,
    a.child_dataset_type,
    a.app_module_id,
    a.properties_last_modified
FROM
    dev_biplatform.s_nq_dss_datasets a,
    dev_biplatform.s_nq_dss_datasets b
WHERE
    a.file_length <> -1
    AND a.datamodel_id = b.dataset_id(+)
ORDER BY
    a.last_modified DESC;
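Finally, a minimal sketch of running this SELECT from Python with the python-oracledb driver, abbreviated to the columns we actually need for the backup; the connection details are placeholders:

import oracledb  # python-oracledb driver: pip install oracledb

SQL = """
SELECT b.dataset_name || '-' || a.dataset_name AS full_name,
       substr(a.filestore_id, 9, instr(a.filestore_id, ',') - 1 - 9) || '.dss' AS orig_file,
       substr(a.filestore_id, instr(a.filestore_id, ',') + 16,
              instr(a.filestore_id, '}') - (instr(a.filestore_id, ',') + 17)) || '.dss' AS parquet_file,
       a.mimetype
  FROM dev_biplatform.s_nq_dss_datasets a, dev_biplatform.s_nq_dss_datasets b
 WHERE a.file_length <> -1 AND a.datamodel_id = b.dataset_id(+)
 ORDER BY a.last_modified DESC
"""

conn = oracledb.connect(user="dev_biplatform", password="***", dsn="dbhost/orclpdb")
with conn.cursor() as cur:
    for full_name, orig_file, parquet_file, mimetype in cur.execute(SQL):
        print(full_name, orig_file, parquet_file, mimetype)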