I often hear asserted in discussions about accessing data in databases that the approach taken should be “database independent”. The assertion is always made as if it were some sort of Scientific Truth (Dogma even). And it is always assumed that this is easy to achieve. Finally, it always seems to be an accepted reality as if it was backed by some sort of consensus in the Computer Science Literature.
(I have almost always heard this when discussing non-Microsoft databases and technologies but rarely when the Microsoft technologies are the principle components of a solution. But, perhaps that is just a selective reading based on my own career!)
This article will provide a simple comparison based on generating “vectors” from the rings of a multi-polygon using Oracle’s ST_Geometry, SQL Server 2008’s STGeometry and PostgreSQL/PostGIS’s ST_Geometry.
Initial Comments on Hierarchy
In my comparison work I use the OGCSFS 1.1 standard with preference for the SQL/MM standard. Since all three databases are “Object” relational databases (ORDBMS) we need to start with a comment about this aspect of the implementation.
Oracle’s ST_Geometry is a true type hierarchy with only those functions defined on specific elements available for geometries of that type. So, for example, you cannot call ST_Area on an ST_CURVE (ST_LINESTRING, ST_MULTILINESTRING etc), whereas in SQL Server 2008 and PostGIS you can.
The availability of a true object hierarchy was not always the case. Up to 10g Oracle only had a singly-inherited type library. The original SDO_GEOMETRY approach is still singly inherited: only the SQLMM/OGC implementation from 10g onwards is not.
In the above SQL you will notice a lack of an ST_LineFromText() constructor that returns an ST_LineString object: this is not available in the ST_CURVE or ST_LINESTRING sub-class objects. Given the use of ST_Geometry one would have thought an ST_GeomFromText() method would be available rather than the non-standard FROM_WKT: nope, sorry, Oracle has not implemented one. The only ST*FromText() method available are:
SELECT object_name ||'.'|| procedure_name AS conversion_method
WHERE owner ='MDSYS'
AND object_name LIKE'ST%'
AND procedure_name LIKE'ST%TEXT';
Finally, with Oracle, some constructors or accessors return ST_Geometry objects even if they are ST_LineStrings etc. To access the methods of the underlying sub-class object, we need to “cast” them to the required type via the Oracle TREAT function. eg SELECTTREAT (geom as ST_LineString) as line FROM …..
Because PostGIS is "singly inherited", it does not use a "dot" object.method invocation from a base class. Thus ST_LineFromText() function is an independent function that returns a geometry (ST_Geometry): No casting is required.
*SQL Server 2008*
SQL Server 2008 is a “singly inherited” type system that uses a “dot” object.method invocation from a base class in object creation or method execution. Thus ST_STLineFromText() function requires the use of “geometry::” that acts as a surrogate for a missing ST_Geometry abstract class.
The following three SQL statements are the result of trying to minimise the differences in SQL between the three databases. Where possible, I avoided use of anything that is specific to a database (eg rownum in Oracle). In addition, I use my SQL Server and Oracle versions of PostgreSQL’s generate_series() set returning function.
-- MultiPoly with two elements (polygons) one with an Inner Ring
WITH poly AS(
SELECTROW_NUMBER()OVER(PARTITION BY b.n ORDERBY b.n)AS polyNo,
FROM poly p
FROM poly p,
(SELECT generate_series(1,ST_NumInteriorRing(p.geom),1)AS ringNo FROM poly p)AS r
(SELECT1AS n) b
ORDERBY a.polyNo, a.ringNo,3;
polyNo ringNo vectorNo sx sy ex ey
------ ------ -------- --- --- --- ---
Constructor function name MultiPolygonFromText different from other databases;
STNumInteriorRing reports only interior rings.
No object.method notation;
ST_GeometryType returns “ST_Polygon” which is upper cased for cross-database comparison purposes;
set returning function must be implemented as an attribute in a select statement in order to be able to use geometry methods.
Set Returning Functions
PostgreSQL’s inability for set returning functions eg generate_series() to draw their values/parameters from another table’s columns in the SQL is one of the major issues in trying to create cross-database SQL.
Here are some examples that show the issue. The first two show how generate_series can be used reasonably “naturally” with the value from the first table “a” being fed into the generate_series of “b”. The only difference is SQL Server’s keywords “CROSSAPPLY” vs Oracle’s “, TABLE”.
ERROR: subquery INFROM cannot refer TO other relations OF same query level
LINE 3: (SELECT generate_series(1,a.maxValue,1)AS column_val...
-- PostgreSQL - Attempt 3
WITH aTable AS(
SELECT b.*, a.maxValue
FROM aTable a,
(SELECT generate_series(1,a.maxValue,1)AS column_value FROM aTable a) b;
Finally, we get a result we can work with with PostgreSQL. However, to use this as the template for “cross database” SQL coding is, I think, pretty unacceptable. It is ugly, wordy and contorted, leading to semantic interpretation problems, never mind the potential for voluminous SQL (with multiple CTEs).
There is a better method in PostgreSQL which is as follows.
However, this use of a table function as an attribute is not possible in SQL Server and Oracle, thus it cannot be a basis for a cross-platform SQL “template”.
I am sure there are better methods. Anyone?
Here is a table comparing the OGC/SQLMM Methods Used:
Standard (SQL/MM ie ISO TC211)
SQL Server 2008
In short, the OGC/SQLMM side of the database access problem is reasonably straight-forward (if anything ever is). Sure, there are naming differences but they can overcome. The lack of a true object hierarchy does provide some interesting implementation issues but these, I think, can be encapsulated via WITH or CTE expressions.
But, on the whole “database (or any technology) independence” is not easy to achieve for a whole host of non-spatial reasons.
1. Implementations of the relevant SQL standards (eg SQL99 Analytics) differ across all databases;
2. Non-standard database-specific functionality exists eg ROWNUM/DUAL/TABLE function (Oracle) etc;
3. Lack of “common” methods across all database products for handling collections eg PostgreSQL has no similar “CROSSAPPLY”/TABLE function handling capability (can do so via methods that the other databases do not provide);
4. OGC/SQLMM implementation varies across databases with with methods missing (eg ST_GeometryN in Oracle), and limited or no inheritance;
5. Interpretation of OGC/SQLMM standards differ eg SQL Server 2008 + PostGIS has geography datatype which the standard does not mention or require.
6. API keywords vary;
7. Database architectures differ.
Well, this is one attempt at looking at the cross-database. I have not discussed other architectural/philosophical design patterns like Oracle’s “Writers never block readers”, database ROI etc. That stuff is well covered elsewhere eg Tom Kyte’s views are well represented in this article by Mark Rittman called The Cost Of Database Independence and Comparison of different SQL implementations.
I have probably made some errors as I don’t profess to being a PostgreSQL expert (just an experienced user). If anyone else wishes to contribute to this discussion please do via some constructive comments.