Tuesday, June 4, 2024

Oracle Spatial - Millions of Points in Thousands of Polygons

I have written about Oracle Spatial in the past (Intro, and data loading of Points, Lines, Polygons). Now I had a new challenge.

We had a POC. The customer wanted to test the performance of 7 million data points in 3,200 polygons. During the POC I learned that performance differs significantly based on the way the query is written. I was lucky to get some guidance from Daniel Geringer (who wrote this guide) of Oracle Spatial. Below is a summary of the process. It was done on Autonomous Database (ADW), but it should be similar on any Oracle DB.

The first part covers the performance benefit of using the proper function. The second covers some advisable data setup and cleaning steps we did.


Part 1 

The customer loaded ~7 million data points (lng/lat) of USA accidents from a similar data source into ADW, and ~3,200 US counties from this or a similar source.

We could have converted the longitude and latitude (lng/lat) data into a new spatial-format column. For several reasons we preferred not to. Instead, we worked directly with the lng/lat points and a GET_GEOMETRY function (see page 25 here, or a slightly longer version here). Next we created an index (see pages 26, 27 here) and fixed some inconsistencies (see the second part, below).
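For reference, here is a minimal sketch of such a function, following the pattern in the linked presentation (the exact version used in the POC may differ slightly):

-- turns lng/lat columns into an SDO_GEOMETRY point; SRID 4326 = longitude/latitude (WGS84)
-- DETERMINISTIC is required so the function can be used in a function-based spatial index
create or replace function get_geometry (longitude in number, latitude in number)
return sdo_geometry deterministic is
begin
  return sdo_geometry(2001,                                      -- 2D point
                      4326,                                      -- lng/lat coordinate system
                      sdo_point_type(longitude, latitude, null),
                      null, null);
end;
/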

My default intention was to use the SDO_CONTAINS operator. Instead, Daniel suggested using SDO_ANYINTERACT or SDO_INSIDE, and switching the order of the geometry parameters passed to the spatial operator.

SDO_ANYINTERACT can be used if it is OK to also return points on the polygon boundary; if the requirement is to return only points inside the polygon, SDO_INSIDE should be used. As for switching the geometry order, there is a huge difference between running 3,200 comparisons against 7 million points and running 7 million comparisons against 3,200 polygons. In our tests it was ~390 seconds vs. ~5 seconds.

In all cases the following format was used:

select count(*) from accident a, counties c where SDO_... ='TRUE';

Let's compare the following:

  • SDO_CONTAINS(C.GEOMETRY, GET_GEOMETRY(LNG,LAT)) = 'TRUE'
  • SDO_ANYINTERACT(GET_GEOMETRY(LNG,LAT),c.geometry)='TRUE'
  • SDO_ANYINTERACT(c.geometry,GET_GEOMETRY(LNG,LAT))='TRUE'
  • SDO_INSIDE(GET_GEOMETRY(LNG,LAT),c.geometry)='TRUE'

In all cases we ran with a parallel execution degree of 8:

  • SDO_CONTAINS(C.GEOMETRY, GET_GEOMETRY(LNG,LAT)) = 'TRUE' returned in 387.7 sec.
  • SDO_ANYINTERACT(GET_GEOMETRY(LNG,LAT),c.geometry)='TRUE' returned in 4.8 sec.
  • SDO_ANYINTERACT(c.geometry,GET_GEOMETRY(LNG,LAT))='TRUE' returned in 389.2 sec.
  • SDO_INSIDE(GET_GEOMETRY(LNG,LAT),c.geometry)='TRUE' returned in 5.0 sec.

This is 80 times faster!

Both SDO_INSIDE and SDO_CONTAINS return the same count (6,973,650), while both SDO_ANYINTERACT variants return a slightly higher count (6,973,846), presumably points that fall exactly on a county boundary.


So we have two winners that use the better parameter order.

I will focus on this option:

select count(*) 
from accident a, counties c
where SDO_ANYINTERACT(GET_GEOMETRY(LNG,LAT),c.geometry)='TRUE';


I wanted to see how parallelism influences this query's performance, so I reran it with various combinations using the ADW High connection (the 8 ECPU explain plan is shown below):
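One simple way to vary the degree of parallelism per run is a statement-level hint, along these lines (a sketch; the ADW service used also affects the actual degree):

select /*+ parallel(8) */ count(*)      -- replace 8 with the degree being tested
from accident a, counties c
where SDO_ANYINTERACT(GET_GEOMETRY(LNG,LAT),c.geometry)='TRUE';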



The results are impressive:



What happens if we increase the amount of data? 

I duplicated the original data 21 times, each time moving each data point randomly by 10-100 meters toward the center of the USA, since I didn't want duplicated locations in the test.
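Something along these lines can do the duplication (a rough sketch, not the exact code I used; the target table name is illustrative, (-98.5, 39.8) is an approximate center of the USA, and metres / 111,000 roughly converts the offset to degrees):

insert into accident_big (lng, lat)
select lng + sign(-98.5 - lng) * dbms_random.value(10, 100) / 111000,
       lat + sign( 39.8 - lat) * dbms_random.value(10, 100) / 111000
from   accident;    -- run repeatedly, each pass gets fresh random offsets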

The result was a table of over 146 million locations. I used a similar query and tested it with various amounts of ECPUs. Here are the results:







Part 2 

For better performance, before execution, it is advisable to:

  • Validate geometry (fix if needed)
  • Create Index

We validated the Counties table:

select name, sdo_geom.validate_geometry_with_context(GEOMETRY, 0.005) status
from COUNTIES;

and received many errors of the type: 13367 [Element <1>] [Ring <1>]

Daniel explained that it is probably due to polygon ring orientation (directionality). It was solved by running sdo_util.rectify_geometry like this:

update counties set geometry=sdo_util.rectify_geometry(geometry,0.005);
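After the rectify it is worth re-running the validation; a quick check like this should now return zero:

select count(*)
from counties
where sdo_geom.validate_geometry_with_context(geometry, 0.005) <> 'TRUE';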

Good code for similar problems can be found in the blog post by Hans Viehmann here.


See also the Spatial Data Validation chapter (slides 3-8) in Oracle Spatial Best Practices and Tuning Tips by Daniel Geringer here.


The last part was the index creation. While I covered index creation for Points, Lines and Polygons earlier, I learned there is a performance benefit in creating a CBTREE-organized index for streaming point data. See slides 29, 30 here.

Our code was:

INSERT INTO USER_SDO_GEOM_METADATA VALUES (
  'ACCIDENTS',  -- table name
  'ADMIN.GET_GEOMETRY(LNG,LAT)',    -- geometry column name
  SDO_DIM_ARRAY(
    SDO_DIM_ELEMENT('X', -180, 180, 0.005),  -- 'X',[min x],[max x],[tolerance]
    SDO_DIM_ELEMENT('Y', -90, 90, 0.005)),    -- 'Y',[min y],[max y],[tolerance]
   4326           -- indicates longitude/latitude coordinates (coordinate system)
 );


CREATE INDEX ACCIDENTS ON ACCIDENTS_2023_03 ( ADMIN.GET_GEOMETRY(LNG,LAT) )
INDEXTYPE IS MDSYS.SPATIAL_INDEX_V2 PARAMETERS('layer_gtype=point cbtree_index=true');

Please note the INDEXTYPE line with its PARAMETERS clause, which is what differs from the regular:
INDEXTYPE IS MDSYS.SPATIAL_INDEX_V2;


P.S. 

One more interesting option to test is the SDO_POINTINPOLYGON function, described by Jean Ihm here and by Karin Patange here.
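I have not tested it here, but based on its documented signature (a cursor of rows containing an SDO_GEOMETRY column, a polygon and a tolerance) a call would look roughly like this, with the polygon passed as a bind variable:

select count(*)
from table(sdo_pointinpolygon(
       cursor(select get_geometry(lng, lat) as geometry from accident),
       :county_geometry,    -- an SDO_GEOMETRY bind holding one county polygon
       0.005));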



Thursday, May 23, 2024

OAS 2024 Recommended Patches

Oracle Analytics Server 2024 (OAS 2024), sometimes named OAS 7.6, has been available since March.

Oracle published Critical Patch Update (CPU) Advisor For Oracle Analytics Server and Oracle Business Intelligence - Updated for April 2024 (Doc ID 2832967.2).

Sorry, I tried to create a link but it didn't work.

This list is expected to be updated regularly.

Right now there are two major parts there: OAS WebLogic & OAS FMW Infrastructure. The first OAS 2024 bundle patch is not available yet.

Under WebLogic we see:

  • A database recommendation of the July 2021 PSU or later, and a Native Network Encryption (NNE) recommendation for the DB.
  • An upgrade of Java SE (JDK/JRE) (Patch 18143322 & Note 3012587.1, About Updating the JDK Location After Installing Oracle Analytics Server).
  • Patch 36485713 for WebLogic.

 


Under FMW Infrastructure we see the following list of patches:

  • Patch 36402397
  • Patch 36348444 + Patch 34809489, or Patch 36074941 + Patch 34809489
  • Patch 36349529 or Patch 35751917
  • Patch 34542329
  • Patch 36316422 or Patch 32784652



A few extra remarks:

In OAS 2024 there is a new option of Workbook export to Excel (on Tables and Pivots). To make it work you need to follow the Installation Guide and install NodeJS.

In OAS 2024 you can use the classic data modeler client for the RPD, or use the new Semantic Modeler web interface with similar functionality. If the Semantic Modeler is missing from the menu after an upgrade from a previous version, you have to grant the missing permission that might have been replaced by the import. See "After Importing A Snapshot into Oracle Analytics Server (OAS), The Semantic Modeler Feature Is No Longer Available (Doc ID 3011914.1)": use Grant Permission Assignment and grant the 'semanticmodeler.author' permission to your administrator role.


Thank you, Itsik Hadar, for finding the last two points.



Tuesday, April 16, 2024

Adding AI - Vision Service, Object Detection to APEX

There is a great site of labs by Oracle - LiveLabs. They use OCI (Oracle Cloud Infrastructure). One of the labs, named "Build AI-Powered Image Search into your Oracle APEX App", combines APEX with the AI Vision Service and uses the IMAGE_CLASSIFICATION option of the service.

I already wrote a post about replacing the default Image Classification option of the Vision Service with Text Detection here (and took a little journey to better understand the Lab).

This time I needed to use the Object Detection option of the Vision Service.

So I will explain the small changes needed in the original Lab, and then talk a bit about the difference between Object Detection and Image Classification.

The picture shows the various options of the AI Vision Service:


All we have to do is make two small changes:

In LAB 3, Task 2 - Invoke the OCI Vision REST Data Source through a Page Process, step 10, make this change:

Click FEATURE_TYPE and enter the following (we replace the IMAGE_CLASSIFICATION value):

Under Value:

  • Type: Static Value

  • Value: OBJECT_DETECTION

 



In LAB 3, Task 2 - Invoke the OCI Vision REST Data Source through a Page Process, step 14, change the PL/SQL code (in the "Parse the Response" part) to:

UPDATE SM_POSTS
SET
AI_OUTPUT = (
    SELECT
        LISTAGG(obj_name, ',') WITHIN GROUP(
        ORDER BY
            obj_name
        )
    FROM
        JSON_TABLE ( :P1_RESPONSE, '$.imageObjects[*]'
            COLUMNS
                obj_name VARCHAR2 ( 100 ) PATH '$.name[*]'
        )
  )
 WHERE
 ID = :P1_ID;

 

Instead of the original:

UPDATE SM_POSTS
SET
AI_OUTPUT = (
    SELECT
        LISTAGG(obj_name, ',') WITHIN GROUP(
        ORDER BY
            obj_name
        )
    FROM
        JSON_TABLE ( :P1_RESPONSE, '$.labels[*]'
            COLUMNS
                obj_name VARCHAR2 ( 100 ) PATH '$.name[*]'
        )
  )
 WHERE
 ID = :P1_ID;

 

What is the difference?

The original feature of the lab, Image Classification, assigns classes and confidence scores based on the scene and contents of an image.

The option used in this post, Object Detection, identifies objects and their location within an image, along with a confidence score.

 

Let's see an example. For the following very simple picture we get two responses:




Image Classification returns: Human arm,Human face,Human hand,Human head,Human nose

Object Detection returns: Helmet,Human face,Person


Here are the two JSON responses returned:

Image Classification:

 {
  "labels": [
    {
      "name": "Human head",
      "confidence": 0.9931826
    },
    {
      "name": "Human hand",
      "confidence": 0.99312717
    },
    {
      "name": "Human arm",
      "confidence": 0.9930545
    },
    {
      "name": "Human nose",
      "confidence": 0.99297017
    },
    {
      "name": "Human face",
      "confidence": 0.9926239
    }
  ],
  "ontologyClasses": [
    {
      "name": "Human head",
      "parentNames": [],
      "synonymNames": []
    },
    {
      "name": "Human nose",
      "parentNames": [],
      "synonymNames": []
    },
    {
      "name": "Human hand",
      "parentNames": [],
      "synonymNames": []
    },
    {
      "name": "Human face",
      "parentNames": [],
      "synonymNames": []
    },
    {
      "name": "Human arm",
      "parentNames": [],
      "synonymNames": []
    }
  ],
  "imageClassificationModelVersion": "1.5.97",
  "errors": []
}



Object Detection:

{
  "imageObjects": [
    {
      "name": "Helmet",
      "confidence": 0.9530024,
      "boundingPolygon": {
        "normalizedVertices": [
          {
            "x": 0.332,
            "y": 0.1619190404797601
          },
          {
            "x": 0.531,
            "y": 0.1619190404797601
          },
          {
            "x": 0.531,
            "y": 0.39580209895052476
          },
          {
            "x": 0.332,
            "y": 0.39580209895052476
          }
        ]
      }
    },
    {
      "name": "Person",
      "confidence": 0.8978557,
      "boundingPolygon": {
        "normalizedVertices": [
          {
            "x": 0.169,
            "y": 0.17541229385307347
          },
          {
            "x": 0.742,
            "y": 0.17541229385307347
          },
          {
            "x": 0.742,
            "y": 0.9940029985007496
          },
          {
            "x": 0.169,
            "y": 0.9940029985007496
          }
        ]
      }
    },
    {
      "name": "Human face",
      "confidence": 0.8083062,
      "boundingPolygon": {
        "normalizedVertices": [
          {
            "x": 0.363,
            "y": 0.34182908545727136
          },
          {
            "x": 0.507,
            "y": 0.34182908545727136
          },
          {
            "x": 0.507,
            "y": 0.5847076461769115
          },
          {
            "x": 0.363,
            "y": 0.5847076461769115
          }
        ]
      }
    }
  ],
  "ontologyClasses": [
    {
      "name": "Human face",
      "parentNames": [],
      "synonymNames": []
    },
    {
      "name": "Helmet",
      "parentNames": [
        "Sports equipment",
        "Clothing"
      ],
      "synonymNames": []
    },
    {
      "name": "Person",
      "parentNames": [],
      "synonymNames": []
    },
    {
      "name": "Sports equipment",
      "parentNames": [],
      "synonymNames": []
    },
    {
      "name": "Clothing",
      "parentNames": [],
      "synonymNames": []
    }
  ],
  "objectDetectionModelVersion": "1.3.557",
  "errors": []
}
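Since Object Detection also returns bounding boxes, here is an untested JSON_TABLE sketch, written against the response above, for the case where you want the normalized vertices as well as the names:

select j.obj_name, j.confidence, j.x, j.y
from json_table(:P1_RESPONSE, '$.imageObjects[*]'
       columns (
         obj_name   varchar2(100) path '$.name',
         confidence number        path '$.confidence',
         nested path '$.boundingPolygon.normalizedVertices[*]'
           columns (x number path '$.x',
                    y number path '$.y')
       )) j;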

 

 

Wednesday, April 10, 2024

Oracle Function Exception - NO_DATA_FOUND

I wanted to write a function that accepts two parameters, an Item_ID and a Manufacturer, and returns the support period (in months).

In most cases the first parameter (Item_ID) is sufficient. There are a few cases where we need the Manufacturer as well.

Unfortunately, sometimes customers don't enter the second parameter correctly, or I get a NULL instead of the Manufacturer. In those cases I still want to return a value based on the first parameter (Item_ID) only.

At first I tried running a few SELECT ... INTO statements into variables in the function and doing if-then-else on the result, but it broke each time there was no value for the second parameter. I could have written a second function with an exception handler and called it from the main function; instead I preferred the solution below, though there are some extreme cases it might not cover.

So I used the EXCEPTION section in the function code, as shown here:


create or replace function f_SUPPORT (P_ITEM IN NUMBER, P_MANUFACTURER IN VARCHAR2)
return NUMBER is
  SUP NUMBER;
begin
  -- first try an exact match on both the item and the manufacturer
  select Support_mnth into SUP
  from   warranty
  where  item = P_ITEM
  and    MANUFACTURER = P_MANUFACTURER;
  return SUP;
exception
  when NO_DATA_FOUND then
    -- wrong or NULL manufacturer: fall back to the item only
    select Support_mnth into SUP
    from   warranty
    where  item = P_ITEM;
    return SUP;
end;
/
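A couple of illustrative calls (the item number and manufacturer are made up):

select f_SUPPORT(12345, 'ACME') from dual;  -- exact match on item + manufacturer
select f_SUPPORT(12345, null)   from dual;  -- no manufacturer match, falls back to the item only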