Go to content Go to navigation and search

Home

Current PostGIS Blog Articles


Search

Browse

RSS / Atom

Email me

textpattern

Creative Commons License
All Blog Articles, Data Models and Free Source Code by Simon Greener, The SpatialDB Advisor is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.

How to apply spatial constraints to PostGIS tables

Thursday May 14 2009 at 06:54

KeywordsSpatial Constraints PostGIS DDL
Summary

Spatial Constraints PostGis Ddl

This article was written before TypeMod was introduced eg Geometry(Polygon,28355)

As I have pointed out in other blog articles, spatial data quality should not engender either/or solutions when building business applications.
That is, if I can only create points for parcel centroids that fall within land parcels, then I don’t just build the rule in to the editing software that implements the end-user edit capabilities of the system: I should also add such constraints to the database that is holding the application data model and data.
The ability of databases in the area of spatial data quality is variable: this is because of the comparative youthful age of current spatial object relational databases.

One of the benefits of the mathematics behind relational theory is that, where the prescriptions of the model are implemented and used, data quality is ensured.
We have many relational and object-relational databases available today (both commercially in Oracle, SQL Server etc and open source, MySQL, PostgreSQL etc) but these databases all implement relational theory (or the SQL standards that have taken over from the pure science) to lesser or greater extent.
A complaint of C.J. Date is that many of todays databases that purport to be relational are not; he even argues persuasively that SQL is not relational.

Prosaically, the sorts of constraints in relational databases that are of use in ensuring data quality are:

  1. Primary/Unique Constraints;
  2. Not Null
  3. Foreign Key Constraints
  4. Column Check Constraints
  5. Table Check Constraints.

In the SQL-92 standard there is the little know ASSERTION (constraint) that, as far as I know, no commercial database today implements.

While all these constraints are based on attribute data and not complex data types such as spatial data, only some can be used to control spatial data quality.
These are the Column and Table versions of the CHECK constraint.

PostGIS’s AddGeometryColumn() Function

PostGIS can ensure spatial data quality through both an ordinary and advanced use of the CHECK constraint.

One of the nice things about registering a table/column using PostGIS’s AddGeometryColumn() function is that it automatically adds in CHECK constraints that test that the entered geometry:

  1. Is of the right type ie POLYGON and not POINT
  2. Has right SRID (if supplied)
  3. Has right dimensionality

The following statements show how this is done.

  1. -- 0 Start with a clean data model
  2. DROP   TABLE simon.parcel_centroid;
  3. Query returned successfully WITH no RESULT IN 16 ms.
  4. --
  5. DROP   TABLE simon.parcel;
  6. Query returned successfully WITH no RESULT IN 16 ms.
  7. --
  8. -- 1 Create bare bones table without geometry column
  9. CREATE TABLE simon.parcel
  10. (
  11.   gid serial NOT NULL PRIMARY KEY
  12. )
  13. WITH (
  14.   OIDS=FALSE
  15. );
  16. NOTICE:  CREATE TABLE will CREATE implicit SEQUENCE "parcel_gid_seq" FOR serial COLUMN "parcel.gid"
  17. NOTICE:  CREATE TABLE / PRIMARY KEY will CREATE implicit INDEX "parcel_pkey" FOR TABLE "parcel"
  18. Query returned successfully WITH no RESULT IN 109 ms.
  19. --
  20. /* Table looks like this:
  21. **
  22. CREATE TABLE simon.parcel
  23. (
  24.   gid serial NOT NULL,
  25.   CONSTRAINT parcel_pkey PRIMARY KEY (gid)
  26. )
  27. WITH (
  28.   OIDS=FALSE
  29. );
  30. */
  31. --
  32. -- Grant ownership
  33. ALTER TABLE simon.parcel OWNER TO postgres;
  34. Query returned successfully WITH no RESULT IN 16 ms.
  35. --
  36. -- 2. Add geometry column and CHECK constraints on geometry
  37. SELECT addGeometryColumn('simon','parcel','geom','28355','POLYGON','2');

addgeometry
text
simon.parcel.geom SRID:28355 TYPE:POLYGON DIMS:2

  1. /* Note that the table now has the following structure.
  2. CREATE TABLE simon.parcel
  3. (
  4.   gid  serial NOT NULL,
  5.   geom geometry,
  6.   CONSTRAINT parcel_pkey PRIMARY KEY (gid),
  7.   CONSTRAINT enforce_dims_geom    CHECK (st_ndims(geom) = 2),
  8.   CONSTRAINT enforce_geotype_geom CHECK (geometrytype(geom) = 'POLYGON'::text OR geom IS NULL),
  9.   CONSTRAINT enforce_srid_geom    CHECK (st_srid(geom) = 28355)
  10. )
  11. */

There is something important to note about these simple CHECK constraints: they use functions to extract a value from the geometry object that can be tested in a first order predicate expression that returns true or false: “function() == value”.

We can now conduct some tests on the table created above to show how these CHECK constraints work.

  1. -- 3. try and insert a POLYGON with wrong dimensionality
  2. INSERT INTO simon.parcel(gid,geom) VALUES (1,ST_GeomFromEWKT('POLYGON ((100 0 -9,120 0 -9,120 20 -9,100 20 -9,100 0 -9))'));
  3. ERROR:  NEW ROW FOR relation "parcel" violates CHECK CONSTRAINT "enforce_dims_geom"
  4. --
  5. -- 4. Try and insert POLYGON with right dimensionality
  6. INSERT INTO simon.parcel(gid,geom) VALUES (1,ST_PolygonFromText('POLYGON ((100 0,120 0,120 20,100 20,100 0))'));
  7. ERROR:  NEW ROW FOR relation "parcel" violates CHECK CONSTRAINT "enforce_srid_geom"
  8. --
  9. -- 5. Try and insert geometry with right SRID
  10. INSERT INTO simon.parcel(gid,geom) VALUES (1,ST_PolygonFromText('POLYGON ((100 0,120 0,120 20,100 20,100 0))',28355));
  11.  Query returned successfully: 1 ROWS affected, 16 ms execution TIME.
  12. --
  13. -- 6. Insert Another POLYGON
  14. INSERT INTO simon.parcel(gid,geom) VALUES (2,ST_PolygonFromText('POLYGON ((0 0,20 0,20 20,0 20,0 0),(10 10,10 11,11 11,11 10,10 10),(5 5,5 7,7 7,7 5,5 5))',28355));
  15. Query returned successfully: 1 ROWS affected, 16 ms execution TIME.
  16. --
  17. -- 7. Now try and insert a MULTIPOLYGON
  18. INSERT INTO simon.parcel(gid,geom) VALUES (3,ST_MultiPolygonFromText('MULTIPOLYGON (((0 0,20 0,20 20,0 20,0 0),(10 10,10 11,11 11,11 10,10 10),(5 5,5 7,7 7,7 5,5 5)),((50 5,50 7,70 7,70 5,50 5)))',28355));
  19. ERROR:  NEW ROW FOR relation "parcel" violates CHECK CONSTRAINT "enforce_geotype_geom"
  20. --
  21. -- How do we fix this if we want both POLYGON and MULTIPOLYGONS in our table?
  22. -- We could do this back at the original call to AddGeometryColumn but here we will show how to do it post-factum.
  23. --
  24. -- 8. Modify the constraint directly
  25. ALTER TABLE simon.parcel DROP CONSTRAINT enforce_geotype_geom;
  26. Query returned successfully WITH no RESULT IN 15 ms.
  27. --
  28. ALTER TABLE simon.parcel
  29.   ADD CONSTRAINT enforce_geotype_geom
  30.   CHECK ((geometrytype(geom) = ANY (ARRAY['MULTIPOLYGON'::text, 'POLYGON'::text])) OR geom IS NULL);
  31. Query returned successfully WITH no RESULT IN 31 ms.
  32. --
  33. -- 9. Try again
  34. INSERT INTO simon.parcel(gid,geom) VALUES (3,ST_MultiPolygonFromText('MULTIPOLYGON (((0 0,20 0,20 20,0 20,0 0), (10 10,10 11,11 11,11 10,10 10), (5 5,5 7,7 7,7 5,5 5)), ((50 5,50 7,70 7,70 5,50 5)))',28355));
  35. Query returned successfully: 1 ROWS affected, 32 ms execution TIME.
  36. --
  37. SELECT gid, ST_AsText(geom)
  38.   FROM simon.parcel;

gid
integer
st_astext
text
1 POLYGON ((100 0,120 0,120 20,100 20,100 0))”
2 POLYGON ((0 0,20 0,20 20,0 20,0 0), (10 10,10 11,11 11,11 10,10 10), (5 5,5 7,7 7,7 5,5 5))”
3 MULTIPOLYGON (((0 0,20 0,20 20,0 20,0 0), (10 10,10 11,11 11,11 10,10 10), (5 5,5 7,7 7,7 5,5 5)), ((50 5,50 7,70 7,70 5,50 5)))”

Parcel Centroid Points within Parcel Polygons

Let’s add a new rule that a land parcel has to have a (single) centroid point that falls within the polygon representing the parcel. We can implement this in a number of ways.

Two Column Table

Firstly, let’s start by adding a new column to our existing table. Tables with multiple geometric columns are not common mainly because GIS packages have been unable to deal with them: though some nowadays do. However, in this situation we have a single centroid point with no other attributes describing the land parcel so it makes sense to add it to the parcel table. Note: If your GIS can’t cope with a multi-geometry columned table try using views for both visualisation and update.

  1. SELECT addGeometryColumn('simon','parcel','centroid','28355','POINT','2');

addgeometry
text
simon.parcel.centroid SRID:28355 TYPE:POINT DIMS:2

Our table now looks like this.

  1. CREATE TABLE simon.parcel
  2. (
  3.   gid      serial NOT NULL,
  4.   geom     geometry,
  5.   centroid geometry,
  6.   CONSTRAINT parcel_pkey              PRIMARY KEY (gid),
  7.   CONSTRAINT enforce_dims_centroid    CHECK (st_ndims(centroid) = 2),
  8.   CONSTRAINT enforce_dims_geom        CHECK (st_ndims(geom) = 2),
  9.   CONSTRAINT enforce_geotype_centroid CHECK (geometrytype(centroid) = 'POINT'::text OR centroid IS NULL),
  10.   CONSTRAINT enforce_geotype_geom     CHECK ((geometrytype(geom) = ANY (ARRAY['MULTIPOLYGON'::text, 'POLYGON'::text])) OR geom IS NULL),
  11.   CONSTRAINT enforce_srid_centroid    CHECK (st_srid(centroid) = 28355),
  12.   CONSTRAINT enforce_srid_geom        CHECK (st_srid(geom) = 28355)
  13. )
  14. WITH (
  15.   OIDS=FALSE
  16. );

How can we implement a spatial constraint to ensure that any point added/updated is checked to ensure it falls within the host polygon? Well, we can use the ST_Covers() function directly in the check constraint. However, we cannot apply this constraint as we have rows in the table with no centroids and PostgreSQL does not have a NOVALIDATE (as Oracle does – this means no not apply the constraint to existing records only inserts/updates from now on) option when adding a constraint.

  1. -- 1 Add centroids to existing polygons
  2. UPDATE simon.parcel SET centroid = ST_Centroid(geom);
  3. Query returned successfully: 3 ROWS affected, 62 ms execution TIME.
  4. --
  5. -- 2. Now, apply centroid constraint
  6. ALTER TABLE simon.parcel
  7.   ADD CONSTRAINT centroid_in_parcel
  8.   CHECK (centroid IS NOT NULL AND ST_Covers(geom,centroid) = TRUE);
  9. ERROR:  CHECK CONSTRAINT "centroid_in_parcel" IS violated BY SOME ROW
  10. --
  11. -- 3. Find which rows fail
  12. SELECT gid, ST_Covers(geom,centroid)
  13.   FROM simon.parcel;

gid
integer
st_covers
boolean
1 t
2 f
3 t

  1. -- 4 Fix it
  2. UPDATE simon.parcel
  3.    SET centroid = ST_PointOnSurface(geom)
  4.  WHERE gid = 2;
  5. Query returned successfully: 1 ROWS affected, 63 ms execution TIME.
  6. --
  7. -- 5. Check
  8. SELECT gid, ST_Covers(geom,centroid)
  9.   FROM simon.parcel
  10.  WHERE gid = 2;

gid
integer
st_covers
boolean
2 t

  1. -- 6. Now we can apply the constraint
  2. ALTER TABLE simon.parcel
  3.   ADD CONSTRAINT centroid_in_parcel
  4.   CHECK (centroid IS NOT NULL AND ST_Covers(geom,centroid) = TRUE);
  5. Query returned successfully WITH no RESULT IN 16 ms.
  6. --
  7. /* Table now looks like this
  8. CREATE TABLE simon.parcel
  9. (
  10.   gid      serial NOT NULL,
  11.   geom     geometry,
  12.   centroid geometry,
  13.   CONSTRAINT parcel_pkey              PRIMARY KEY (gid),
  14.   CONSTRAINT centroid_in_parcel       CHECK (centroid IS NOT NULL AND st_covers(geom, centroid) = true),
  15.   CONSTRAINT enforce_dims_centroid    CHECK (st_ndims(centroid) = 2),
  16.   CONSTRAINT enforce_dims_geom        CHECK (st_ndims(geom) = 2),
  17.   CONSTRAINT enforce_geotype_centroid CHECK (geometrytype(centroid) = 'POINT'::text OR centroid IS NULL),
  18.   CONSTRAINT enforce_geotype_geom     CHECK ((geometrytype(geom) = ANY (ARRAY['MULTIPOLYGON'::text, 'POLYGON'::text])) OR geom IS NULL),
  19.   CONSTRAINT enforce_srid_centroid    CHECK (st_srid(centroid) = 28355),
  20.   CONSTRAINT enforce_srid_geom        CHECK (st_srid(geom) = 28355)
  21. )
  22. WITH (
  23.   OIDS=FALSE
  24. );
  25. */

Two Table Solution with Foreign Key

It is not common to see the above modelling. Mostly one sees the parcel polygons in one table with another table being used to hold the parcel centroids. We can still do our centroid_in_parcel check constraint, however, but there is a little bit more involved.

  1. -- 1. Create a table to hold the centroid
  2. DROP   TABLE simon.parcel_centroid;
  3. Query returned successfully WITH no RESULT IN 47 ms.
  4. --
  5. CREATE TABLE simon.parcel_centroid
  6. (
  7.   gid        serial NOT NULL PRIMARY KEY,
  8.   gid_parcel INTEGER NOT NULL,
  9.   CONSTRAINT parcel_gid_fk FOREIGN KEY (gid_parcel) REFERENCES simon.parcel(gid)
  10. )
  11. WITH (
  12.   OIDS=FALSE
  13. );
  14. NOTICE:  CREATE TABLE will CREATE implicit SEQUENCE "parcel_centroid_gid_seq" FOR serial COLUMN "parcel_centroid.gid"
  15. NOTICE:  CREATE TABLE / PRIMARY KEY will CREATE implicit INDEX "parcel_centroid_pkey" FOR TABLE "parcel_centroid"
  16. Query returned successfully WITH no RESULT IN 156 ms.
  17. --
  18. /* Table looks like this
  19. **
  20. CREATE TABLE simon.parcel_centroid
  21. (
  22.   gid        serial  NOT NULL,
  23.   gid_parcel integer NOT NULL,
  24.   geom       geometry,
  25.   CONSTRAINT parcel_centroid_pkey PRIMARY KEY (gid),
  26.   CONSTRAINT parcel_gid_fk FOREIGN KEY (gid_parcel)
  27.              REFERENCES simon.parcel (gid) MATCH SIMPLE
  28.              ON UPDATE NO ACTION ON DELETE NO ACTION
  29. )
  30. WITH (
  31.   OIDS=FALSE
  32. );
  33. */
  34. --
  35. -- Change ownership
  36. ALTER TABLE simon.parcel_centroid OWNER TO postgres;
  37. Query returned successfully WITH no RESULT IN 16 ms.
  38. --
  39. -- 2. Add geometry column and CHECK constraints on geometry
  40. SELECT addGeometryColumn('simon','parcel_centroid','geom','28355','POINT','2');

addgeometry
text
simon.parcel_centroid.geom SRID:28355 TYPE:POINT DIMS:2

Our parcel_centroid table now looks like this:

  1. CREATE TABLE simon.parcel_centroid
  2. (
  3.   gid        serial NOT NULL,
  4.   gid_parcel INTEGER NOT NULL,
  5.   geom       geometry,
  6.   CONSTRAINT parcel_centroid_pkey PRIMARY KEY (gid),
  7.   CONSTRAINT parcel_gid_fk FOREIGN KEY (gid_parcel)
  8.              REFERENCES simon.parcel (gid) MATCH SIMPLE
  9.              ON UPDATE NO ACTION ON DELETE NO ACTION,
  10.   CONSTRAINT enforce_dims_geom    CHECK (st_ndims(geom) = 2),
  11.   CONSTRAINT enforce_geotype_geom CHECK (geometrytype(geom) = 'POINT'::text OR geom IS NULL),
  12.   CONSTRAINT enforce_srid_geom    CHECK (st_srid(geom) = 28355)
  13. )
  14. WITH (
  15.   OIDS=FALSE
  16. );
  17. */
  18. --
  19. -- 3. Now, insert data via a query against the parcel table
  20. INSERT INTO simon.parcel_centroid(gid_parcel, geom)
  21. SELECT gid,ST_Centroid(geom)
  22.   FROM simon.parcel;
  23. Query returned successfully: 3 ROWS affected, 16 ms execution TIME.

The only way we can check that these centroids fall inside the geometry polygons in the related, parcel, table is by constructing a function and using it instead of ST_Covers in our CHECK constraint

  1. -- 4. Construction function
  2. CREATE OR REPLACE FUNCTION simon.centroid_in_parcel(geometry)
  3.   RETURNS INTEGER AS
  4. $BODY$
  5. SELECT COUNT(*)::INTEGER FROM simon.parcel p WHERE ST_Covers(p.geom,$1);
  6. $BODY$
  7.   LANGUAGE 'sql' IMMUTABLE
  8.   COST 100;
  9. Query returned successfully WITH no RESULT IN 16 ms.
  10. --
  11. ALTER FUNCTION simon.centroid_in_parcel(geometry) OWNER TO postgres;
  12. Query returned successfully WITH no RESULT IN 16 ms.
  13. --
  14. -- 5. Check for validity before application
  15. SELECT c.gid, ST_Covers(p.geom,c.geom)
  16.   FROM simon.parcel p
  17.        INNER JOIN
  18.        simon.parcel_centroid c
  19.        ON ( c.gid_parcel = p.gid );

gid
integer
st_covers
boolean
1 t
2 t
3 f

  1. -- 6. Fix gid = 3
  2. UPDATE simon.parcel_centroid c SET geom = (SELECT ST_PointOnSurface(geom) FROM simon.parcel p WHERE p.gid = c.gid_parcel) WHERE gid = 3;
  3. Query returned successfully: 1 ROWS affected, 15 ms execution TIME.
  4. --
  5. -- 7. Check again for validity before application
  6. SELECT c.gid, ST_Covers(p.geom,c.geom)
  7.   FROM simon.parcel p
  8.        INNER JOIN
  9.        simon.parcel_centroid c
  10.        ON ( c.gid_parcel = p.gid )
  11.  WHERE c.gid = 3;

gid
integer
st_covers
boolean
3 t

  1. -- 8. Now apply constraint
  2. ALTER TABLE simon.parcel_centroid
  3.   ADD CONSTRAINT centroid_in_parcel_ck
  4. CHECK (simon.centroid_in_parcel(geom) > 0);
  5. Query returned successfully WITH no RESULT IN 16 ms.
  6. --
  7. /* Table looks like this
  8. **
  9. CREATE TABLE simon.parcel_centroid
  10. (
  11.   gid        serial NOT NULL,
  12.   gid_parcel integer NOT NULL,
  13.   geom        geometry,
  14.   CONSTRAINT parcel_centroid_pkey PRIMARY KEY (gid),
  15.   CONSTRAINT parcel_gid_fk FOREIGN KEY (gid_parcel)
  16.              REFERENCES simon.parcel (gid) MATCH SIMPLE
  17.              ON UPDATE NO ACTION ON DELETE NO ACTION,
  18.   CONSTRAINT centroid_in_parcel_ck CHECK (simon.centroid_in_parcel(geom) > 0),
  19.   CONSTRAINT enforce_dims_geom     CHECK (st_ndims(geom) = 2),
  20.   CONSTRAINT enforce_geotype_geom  CHECK (geometrytype(geom) = 'POINT'::text OR geom IS NULL),
  21.   CONSTRAINT enforce_srid_geom     CHECK (st_srid(geom) = 28355)
  22. )
  23. WITH (
  24.   OIDS=FALSE
  25. );
  26. */
  27. --
  28. -- 9. Now test with an invalid centroid
  29. UPDATE simon.parcel_centroid c
  30.    SET geom = (SELECT ST_Centroid(geom)
  31.                  FROM simon.parcel p
  32.                 WHERE p.gid = c.gid_parcel)
  33.  WHERE gid = 3;
  34. ERROR:  NEW ROW FOR relation "parcel_centroid" violates CHECK CONSTRAINT "centroid_in_parcel_ck"

Excellent it works exactly as we wish: no-one can now write a centroid point that falls outside its land parcel!

Because PostgreSQL allows functions to be called from within a table’s CHECK constraints, powerful spatial data quality constraints can be added to a data model with spatial data embedded with it.

I hope this little article is of use to someone.

Creative Commons License

post this at del.icio.uspost this at Diggpost this at Technoratipost this at Redditpost this at Farkpost this at Yahoo! my webpost this at Windows Livepost this at Google Bookmarkspost this to Twitter

Comment [2]

great job Simon! I wonder if it is possible to use two different geometry tables and test whether a parcel is inside the Town boundary? What do you think?

— art lembo · 20 December 2009, 03:24 · #

Art,

Can you give me an example of two tables with some simple sample geometries?

regards
Simon

Simon · 20 December 2009, 06:49 · #