Skip to content

CartoDB Oracle FDW

Jorge Sanz edited this page Jul 9, 2015 · 4 revisions

About this document

The objective of this document is to give a high level view of the status of the Oracle connector for CartoDB as well as the technical details on how to install it and set it up have a working environment of Oracle tables exposed as tables that can be read by CartoDB.

Most technical contents of this page have been extracted from the CartoDB onpremise FDW.

Overview

Exposing information stored on clients databases is a typical request. Oracle is one of the most used databases and having a working procedure to connect CartoDB with an Oracle database is needed.

The use case where the Oracle FDW works best with CartoDB is when joining a CartoDB to an Oracle table that stores additional alphanumeric tables.

Other use cases like showing Oracle Spatial geometries are not that good because:

  • The FDW doesn't expose to PostGIS the geometries and the cartographic projection is not recognized

  • The FDW doesn't pass to Oracle the operations done to the geometry so for example a query that tries to gets the geometries by a bounding box will don't convert the ST_Intersects function to the equivalent SDO_RELATE in Oracle so it will retrieve the full dataset and do the operation locally. On the other hand this works for alphanumeric values with some caveats discussed earlier.

Resources

The Oracle FDW is documented and hosted at:

The maintainer of the project is active and accessible for questions and fixing bugs.

Installation

Requirements:

  1. Support for running make (build-essentialpackage, etc)
  2. Postgres development libraries
  3. Oracle Instant Client (follow instructions at ubuntu website including the SDK Fix section)

The easiest way to install the extension is to download the GitHub repository and build it locally with the typical make; make install commands. Once the Oracle FDW shared library is accessible by Postgres it should accept to load the extension following next section instructions.

Before trying to connect using the extension, is advisable to try to connect to the remote Oracle instance using the sqlplus tool so you can be sure everything is working on the Oracle side. To access an Oracle server using this tool you need to run

sqlplus 'user/pass@(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(Host=hostname.network)(Port=1521))(CONNECT_DATA=(SID=remote_SID)))'

There are other ways to connect to an Oracle host, one is using easy connect and the other is using a tsnames.ora definition file.

Usage

Usage example:

CREATE EXTENSION oracle_fdw;

CREATE SERVER oradb 
    FOREIGN DATA WRAPPER oracle_fdw
    OPTIONS (
        dbserver '//dbserver.mydomain.com/ORADB'
    );

GRANT USAGE ON FOREIGN SERVER oradb TO development_cartodb_user_b7861beb-21d4-46bc-ae00-5f09f77701a6;

-- Then you can connect to PostgreSQL as 
-- "development_cartodb_user_b7861beb-21d4-46bc-ae00-5f09f77701a6":

CREATE USER MAPPING FOR development_cartodb_user_b7861beb-21d4-46bc-ae00-5f09f77701a6 
    SERVER oradb
    OPTIONS (user 'myremoteuser', password 'mypwd');

CREATE FOREIGN TABLE oratab (
          id        integer OPTIONS (key 'true')  NOT NULL, -- "key" option flags primary key
          text      character varying(30),
          start_time    timestamp,
          floating  double precision  NOT NULL
       ) 
    SERVER oradb 
    OPTIONS (
        schema 'MYREMOTEUSER', 
        table 'ORATAB'
     );

Surprises

  • Oracle has two time types: timestamp and date. Timestamps are accurate to fractions of second; dates are accurate to seconds. This is different from PostgreSQL, which uses timestamps that are very accurate and dates that are just calendar dates, to the nearest day. If you have a date column in Oracle with an index on it, and you attempt to access it via the FDW, the index will not come into play, but the Oracle FDW maps all PostgreSQL time types into timestamps, not dates. To get dates through the FDW, you have to avoid the FDW type mapper, by using the to_date() function (this works because the FDW function mapper recognizes to_date() as a function it can pass through to Oracle untouched).
-- Don't use this
SELECT * FROM oratab WHERE timestamp < '2015-05-05 01:02:32';
-- Use this instead
SELECT * FROM oratab WHERE timestamp < to_date('2015-05-05 01:02:32', 'YYYY-MM-DD HH24:MI:SS');
  • Because the Oracle Instant Client comes with its own LDAP implementation, it will clash with OpenLDAP library if Postgres is configured to use LDAP. Check the LDAP Libraries section of the README file.

Spatial support

geometry_columns

Oracle spatial data type (SDO_GEOMETRY) is supported, but the geometry_columns metadata view will catalogue foreign tables as SRID=0 no matter they are correctly set up at Oracle using their own SRID cataloguing numbers or using EPSG table.

So querying the Oracle metadata table gives the correct information:

SQL> select table_name, srid from user_sdo_geom_metadata where table_name like 'POINTS';

TABLE_NAME			       SRID
-------------------------------- ----------
POINTS				       4326

But querying the catalogue in the same way geometry_column view is configured gives a 0 value:

testoracle=# SELECT c.relname::character varying(256) AS f_table_name,
    postgis_typmod_srid(a.atttypmod) as srid  
FROM pg_class c,  pg_attribute a, pg_type t, pg_namespace n
WHERE t.typname = 'geometry'::name AND a.attisdropped = false AND a.atttypid = t.oid AND a.attrelid = c.oid AND c.relnamespace = n.oid AND (c.relkind = 'r'::"char" OR c.relkind = 'v'::"char" 
OR c.relkind = 'm'::"char" OR c.relkind = 'f'::"char") AND NOT pg_is_other_temp_schema(c.relnamespace) 
AND NOT (n.nspname = 'public'::name AND c.relname = 'raster_columns'::name) AND 
has_table_privilege(c.oid, 'SELECT'::text) and c.relname like 'orapoints';
 f_table_name | srid 
--------------+------
 orapoints    |    0
(1 row)

To fix this issue it's possible to define a view that wraps the foreign data table specifying the type and SRID using Postgres typmod notation. So in this case creating the view

create view vorapoints as select id, description, cat, geometry::geometry(Point,4326) from orapoints ;

is then correctly catalogued:

testoracle=# select f_table_name, srid from geometry_columns where f_table_name = 'vorapoints';
 f_table_name | srid
--------------+------
 vorapoints   | 4326
(1 row)

Spatial functions on geometry type

Although the format is supported, functions are not mapped to Oracle so Postgres will end up doing a full select over the table in order to apply any function on the geometry column. So if for example we have a world borders table and we want to count the countries that intersect a bounding box we will have something like this:

database=# explain select name from world where geom && ST_MakeEnvelop(20,20,40,40,4326);

QUERY PLAN 
---------------------------------------------------------------------------------------------------------
 Foreign Scan on world  (cost=10000.00..10000.00 rows=1000 width=178)
   Filter: (geom && '0103000020E610000001000000050000000000000000003440000000000000344000000000000034400000000000004440000000000000444000000000000044400000000000004440000000000000344000000000000034400000000000003440'::geometry)
   Oracle query: SELECT /*9f30fa765ded1ad160dcd1b55f9e6032*/ "NAME", "GEOM" FROM "CDBTEST"."WORLD_BORDERS"
(3 rows)

on the other hand if we do the filter by a numeric column then the filter is well passed to Oracle:

database=# explain select name from world where pop2005 < 1e7;

QUERY PLAN                                                                 
---------------------------------------------------------------------------------------------------------
 Foreign Scan on world  (cost=10000.00..10000.00 rows=1000 width=178)
   Oracle query: SELECT /*c1f09bfc9f6402f6f945d6b9e8da1ff3*/ "NAME", "POP2005" FROM "CDBTEST"."WORLD_BORDERS" WHERE ("POP2005" < 10000000)
(2 rows)

CartoDB specifics

Cache invalidation

To invalidate caches on CartoDB platform Postgres tables are watched and some metadata is stored to fire the invalidation process. Because Oracle changes are not exposed to Postgres the cache is not automatically invalidated when changes happen on the Oracle side.

Workarounds to this can be implemented depending on the specifics of every project, like:

  • adding a changing parameter on requests to bypass the cache
  • creating a crontab that fires a fake update on the Postgres table with the desired periodicity
  • replicating the tracking changes metadata table on the Oracle side and check it frequently to fire the fake updates only on those tables that have changes.
Clone this wiki locally