Discussion:
postgis topology performance
Rémi Cura
2014-03-19 17:43:21 UTC
Permalink
Hey,

We tried several way to load an already topologically correct shapefile
into postgis topology .

10k lines loaded
_Using totopogeom : 750 sec
_Using addlinestring : 246 sec
_Manually using addnode and st_addedgemodface : 86 sec.

Still very slow :-/

Considering manually filling table /and/or using GEOS/or using grass loader.

Cheers,

Rémi-C
Sandro Santilli
2014-03-19 18:08:23 UTC
Permalink
Post by Rémi Cura
Hey,
We tried several way to load an already topologically correct shapefile
into postgis topology .
10k lines loaded
_Using totopogeom : 750 sec
_Using addlinestring : 246 sec
_Manually using addnode and st_addedgemodface : 86 sec.
Still very slow :-/
Considering manually filling table /and/or using GEOS/or using grass loader.
Interesting numbers, thanks.
By "addlinestring" you mean TopoGeo_addLinestring ?
Is this with postgis trunk ?

Note that ST_AddEdgeModFace (called by TopoGeo_addLinestring)
still performs checks to verify the line you add does not
cross othe redges. There's currently no API exposed for simply
adding the linking attributes (face left/right and next edges).

I was thinking that one idea could be to manually insert all
the nodes and edges and then have a function to "polygonize"
the supposedly pre-noded resultance set. This would be pretty
much the same model used by GEOSPolygonize (requires pre-noded
input) and would probably be faster for batch constructions.

To complete the picture, how long does it take to simply insert
all your lines as edges ? So we get a feeling of IO and referential
integrity checks cost...

--strk;
Rémi Cura
2014-03-20 12:39:53 UTC
Permalink
Hey,
addlinestring is topology.topogeo_addlinestring(atopology character
varying, aline geometry, tolerance double precision DEFAULT 0).
My version : POSTGIS="2.2.0dev r12325" GEOS="3.5.0dev-CAPI-1.9.0 r3963"
PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.11dev, released 2013/04/13"
LIBXML="2.8.0" TOPOLOGY RASTER
PostgreSQL 9.3.3 on i686-pc-linux-gnu, compiled by gcc (Ubuntu/Linaro
4.6.3-1ubuntu5) 4.6.3, 32-bit


About your idea,
it was our first intend, hence the test with manually using
"topology.addnode(atopology character varying, apoint geometry)",
then manually adding the line : topology.st_addedgemodface(atopology
character varying, anode integer, anothernode integer, acurve geometry).
In an ideal world we would have then used polygonize, but as you said there
is no way to build only adjacency without faces.


In my test , adding points and generating the table for line insertion was
10 sec, the other 70 sec being for line insertion.

I don't understand your last question, do you mean just fill the table
manually without computing adjacencies?

Cheers,
Rémi-C
Post by Rémi Cura
Post by Rémi Cura
Hey,
We tried several way to load an already topologically correct shapefile
into postgis topology .
10k lines loaded
_Using totopogeom : 750 sec
_Using addlinestring : 246 sec
_Manually using addnode and st_addedgemodface : 86 sec.
Still very slow :-/
Considering manually filling table /and/or using GEOS/or using grass
loader.
Interesting numbers, thanks.
By "addlinestring" you mean TopoGeo_addLinestring ?
Is this with postgis trunk ?
Note that ST_AddEdgeModFace (called by TopoGeo_addLinestring)
still performs checks to verify the line you add does not
cross othe redges. There's currently no API exposed for simply
adding the linking attributes (face left/right and next edges).
I was thinking that one idea could be to manually insert all
the nodes and edges and then have a function to "polygonize"
the supposedly pre-noded resultance set. This would be pretty
much the same model used by GEOSPolygonize (requires pre-noded
input) and would probably be faster for batch constructions.
To complete the picture, how long does it take to simply insert
all your lines as edges ? So we get a feeling of IO and referential
integrity checks cost...
--strk;
_______________________________________________
postgis-users mailing list
http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
Sandro Santilli
2014-03-20 12:54:08 UTC
Permalink
Post by Rémi Cura
Hey,
addlinestring is topology.topogeo_addlinestring(atopology character
varying, aline geometry, tolerance double precision DEFAULT 0).
My version : POSTGIS="2.2.0dev r12325" GEOS="3.5.0dev-CAPI-1.9.0 r3963"
PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.11dev, released 2013/04/13"
LIBXML="2.8.0" TOPOLOGY RASTER
PostgreSQL 9.3.3 on i686-pc-linux-gnu, compiled by gcc (Ubuntu/Linaro
4.6.3-1ubuntu5) 4.6.3, 32-bit
Ok, that one has all the performance improvements in topology done so far.
Post by Rémi Cura
About your idea,
it was our first intend, hence the test with manually using
"topology.addnode(atopology character varying, apoint geometry)",
then manually adding the line : topology.st_addedgemodface(atopology
character varying, anode integer, anothernode integer, acurve geometry).
In an ideal world we would have then used polygonize, but as you said there
is no way to build only adjacency without faces.
Calling ST_AddEdgeModFace already does that for you.
In an ideal world you would use another function for that.
Actually, in that ideal world you would also avoid to call
AddNode once for each connected edge, right ? And in the same
ideal world AddNode would not go checking for hitting an edge
or another pre-existing node...
Post by Rémi Cura
In my test , adding points and generating the table for line insertion was
10 sec, the other 70 sec being for line insertion.
I don't understand your last question, do you mean just fill the table
manually without computing adjacencies?
Yes, just filling the table.

--strk;
Post by Rémi Cura
Post by Rémi Cura
Post by Rémi Cura
Hey,
We tried several way to load an already topologically correct shapefile
into postgis topology .
10k lines loaded
_Using totopogeom : 750 sec
_Using addlinestring : 246 sec
_Manually using addnode and st_addedgemodface : 86 sec.
Still very slow :-/
Considering manually filling table /and/or using GEOS/or using grass
loader.
Interesting numbers, thanks.
By "addlinestring" you mean TopoGeo_addLinestring ?
Is this with postgis trunk ?
Note that ST_AddEdgeModFace (called by TopoGeo_addLinestring)
still performs checks to verify the line you add does not
cross othe redges. There's currently no API exposed for simply
adding the linking attributes (face left/right and next edges).
I was thinking that one idea could be to manually insert all
the nodes and edges and then have a function to "polygonize"
the supposedly pre-noded resultance set. This would be pretty
much the same model used by GEOSPolygonize (requires pre-noded
input) and would probably be faster for batch constructions.
To complete the picture, how long does it take to simply insert
all your lines as edges ? So we get a feeling of IO and referential
integrity checks cost...
--strk;
_______________________________________________
postgis-users mailing list
http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
_______________________________________________
postgis-users mailing list
http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
--
()  ASCII ribbon campaign -- Keep it simple !
/\  http://strk.keybit.net/rants/ascii_mails.txt
Rémi Cura
2014-03-20 17:46:51 UTC
Permalink
Yep,
We tried a modified version of AddNode without check (because I already
delete duplicates nodes before calling the function), it was not incredibly
faster, and anyway way faster than line insertion (where I feel we need to
focus now).

I may try to simplify the add edge function, but I don't think this is the
way to go.

It would be better to do batch direct insertion in table
(to be precise : insert all edges into edge_data for the columns edge_id,
start_node, end_node, geom )
, then batch edge_data completion (next_left_edge, next_right_edge, found
with querry on node_id),
then batch compute face (with polygonize) .

This way there would be no edge by edge insertion.


About your question :
only filling the edge_data with geometry (on next_left etc) is about 4.5
sec (on par with inserting all the points) for 10k lines. ON this 4.5 sec
you can remove about 3 sec related to my data selection process (I only
insert a part of a table).
This way, I would say that taking 1.5 sec to insert 10 k lines+ check
constraints and update index is OK for me.

Still a long way to go from 80 sec to few sec (10 sec would be cool, if it
scales well).

Cheers,

Rémi-C
Post by Sandro Santilli
Post by Rémi Cura
Hey,
addlinestring is topology.topogeo_addlinestring(atopology character
varying, aline geometry, tolerance double precision DEFAULT 0).
My version : POSTGIS="2.2.0dev r12325" GEOS="3.5.0dev-CAPI-1.9.0 r3963"
PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.11dev, released 2013/04/13"
LIBXML="2.8.0" TOPOLOGY RASTER
PostgreSQL 9.3.3 on i686-pc-linux-gnu, compiled by gcc (Ubuntu/Linaro
4.6.3-1ubuntu5) 4.6.3, 32-bit
Ok, that one has all the performance improvements in topology done so far.
Post by Rémi Cura
About your idea,
it was our first intend, hence the test with manually using
"topology.addnode(atopology character varying, apoint geometry)",
then manually adding the line : topology.st_addedgemodface(atopology
character varying, anode integer, anothernode integer, acurve geometry).
In an ideal world we would have then used polygonize, but as you said
there
Post by Rémi Cura
is no way to build only adjacency without faces.
Calling ST_AddEdgeModFace already does that for you.
In an ideal world you would use another function for that.
Actually, in that ideal world you would also avoid to call
AddNode once for each connected edge, right ? And in the same
ideal world AddNode would not go checking for hitting an edge
or another pre-existing node...
Post by Rémi Cura
In my test , adding points and generating the table for line insertion
was
Post by Rémi Cura
10 sec, the other 70 sec being for line insertion.
I don't understand your last question, do you mean just fill the table
manually without computing adjacencies?
Yes, just filling the table.
--strk;
Post by Rémi Cura
Post by Rémi Cura
Post by Rémi Cura
Hey,
We tried several way to load an already topologically correct
shapefile
Post by Rémi Cura
Post by Rémi Cura
Post by Rémi Cura
into postgis topology .
10k lines loaded
_Using totopogeom : 750 sec
_Using addlinestring : 246 sec
_Manually using addnode and st_addedgemodface : 86 sec.
Still very slow :-/
Considering manually filling table /and/or using GEOS/or using grass
loader.
Interesting numbers, thanks.
By "addlinestring" you mean TopoGeo_addLinestring ?
Is this with postgis trunk ?
Note that ST_AddEdgeModFace (called by TopoGeo_addLinestring)
still performs checks to verify the line you add does not
cross othe redges. There's currently no API exposed for simply
adding the linking attributes (face left/right and next edges).
I was thinking that one idea could be to manually insert all
the nodes and edges and then have a function to "polygonize"
the supposedly pre-noded resultance set. This would be pretty
much the same model used by GEOSPolygonize (requires pre-noded
input) and would probably be faster for batch constructions.
To complete the picture, how long does it take to simply insert
all your lines as edges ? So we get a feeling of IO and referential
integrity checks cost...
--strk;
_______________________________________________
postgis-users mailing list
http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
_______________________________________________
postgis-users mailing list
http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
--
() ASCII ribbon campaign -- Keep it simple !
/\ http://strk.keybit.net/rants/ascii_mails.txt
_______________________________________________
postgis-users mailing list
http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
Sandro Santilli
2014-03-21 09:40:59 UTC
Permalink
Post by Rémi Cura
Yep,
We tried a modified version of AddNode without check (because I already
delete duplicates nodes before calling the function), it was not incredibly
faster, and anyway way faster than line insertion (where I feel we need to
focus now).
Show numbers, c'mon! :)
Post by Rémi Cura
I may try to simplify the add edge function, but I don't think this is the
way to go.
For sure the AddEdge funtion makes a lot more checks.
I was actually thinking to split that myself between a checking and
non-checking version.
Also the checks themselves might have space for optimizations.
Post by Rémi Cura
It would be better to do batch direct insertion in table
(to be precise : insert all edges into edge_data for the columns edge_id,
start_node, end_node, geom )
, then batch edge_data completion (next_left_edge, next_right_edge, found
with querry on node_id),
then batch compute face (with polygonize) .
I agree this would be also an interesting way to proceed.
Still I don't think polygonize would be that fast (not right now).
Post by Rémi Cura
This way there would be no edge by edge insertion.
But there'll still be edge-by-edge updates, which would be still
expensive (every update is like a new insert).
Post by Rémi Cura
only filling the edge_data with geometry (on next_left etc) is about 4.5
sec (on par with inserting all the points) for 10k lines. ON this 4.5 sec
you can remove about 3 sec related to my data selection process (I only
insert a part of a table).
This way, I would say that taking 1.5 sec to insert 10 k lines+ check
constraints and update index is OK for me.
So it makes 1.5 seconds for 10k edges + 4.5 seconds for their unique points
(how many ?)
Post by Rémi Cura
Still a long way to go from 80 sec to few sec (10 sec would be cool, if it
scales well).
Well, if you could compute all the linking in memory it'd be about those
inserts only (< 10 secs for the inserts). Time to look at GEOS GeomGraph
capability of representing all we need ?

--strk;
Rémi Cura
2014-03-25 14:13:13 UTC
Permalink
Back after some other testing .

(_from memory, suppressing checks in addpoints was about 2 sec instead of 3
sec.
_About 21k unique points.)


Here is a little comparison :
(32k polylines,21k unique points, already 'topological'( no need to cut
the lines) ,with a lot of attributes)

Postgis Topology
Postgis Topology (addtopogeom) : 700+ sec for 10k lines, (??about 1600 sec
for 32k lines??)
Postgis Topology (addlinestring) : 240+ sec for 10k lines, (??about 780
sec for 32k lines??)
Postgis Topology (addnode, addedge) : 80sec for 10 k lines (??about 260sec
for 32k lines??)

Grass Gis 7 :
Importing everything : 10 sec
Importing only geom (no attributes) : 5 sec

Importing everything then Exporting all topologies to postgis topology :
150 sec
Declare postgis topo as external source and fill it with shapefile : 145 sec

Geos : (upper bound measured trough st_polygonize)
ST_Polygonize on everything (aggregates on 32k lines) : about 5 sec

TopoJSON (according to previous thread : postgis topology/60 )
Importing geometry (extrapolated) : about 30 sec?

CGAL (half edge) :
hopefully soon

Geoxygene (JTS) :
hopefully soon

I succeeded in putting the grass vector layer into postgis topology as it
is proposed here
http://grasswiki.osgeo.org/wiki/PostGIS_Topology.<http://grasswiki.osgeo.org/wiki/PostGIS_Topology>
Maybe it could be better configured, but it is still slow (I tried both
proposed ways).

About using GEOS as backend : currently there is no way to access
topological information from geos (all methods are private).

Cheers,
Rémi-C
Post by Rémi Cura
Post by Rémi Cura
Yep,
We tried a modified version of AddNode without check (because I already
delete duplicates nodes before calling the function), it was not
incredibly
Post by Rémi Cura
faster, and anyway way faster than line insertion (where I feel we need
to
Post by Rémi Cura
focus now).
Show numbers, c'mon! :)
Post by Rémi Cura
I may try to simplify the add edge function, but I don't think this is
the
Post by Rémi Cura
way to go.
For sure the AddEdge funtion makes a lot more checks.
I was actually thinking to split that myself between a checking and
non-checking version.
Also the checks themselves might have space for optimizations.
Post by Rémi Cura
It would be better to do batch direct insertion in table
(to be precise : insert all edges into edge_data for the columns edge_id,
start_node, end_node, geom )
, then batch edge_data completion (next_left_edge, next_right_edge, found
with querry on node_id),
then batch compute face (with polygonize) .
I agree this would be also an interesting way to proceed.
Still I don't think polygonize would be that fast (not right now).
Post by Rémi Cura
This way there would be no edge by edge insertion.
But there'll still be edge-by-edge updates, which would be still
expensive (every update is like a new insert).
Post by Rémi Cura
only filling the edge_data with geometry (on next_left etc) is about 4.5
sec (on par with inserting all the points) for 10k lines. ON this 4.5 sec
you can remove about 3 sec related to my data selection process (I only
insert a part of a table).
This way, I would say that taking 1.5 sec to insert 10 k lines+ check
constraints and update index is OK for me.
So it makes 1.5 seconds for 10k edges + 4.5 seconds for their unique points
(how many ?)
Post by Rémi Cura
Still a long way to go from 80 sec to few sec (10 sec would be cool, if
it
Post by Rémi Cura
scales well).
Well, if you could compute all the linking in memory it'd be about those
inserts only (< 10 secs for the inserts). Time to look at GEOS GeomGraph
capability of representing all we need ?
--strk;
_______________________________________________
postgis-users mailing list
http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
Sandro Santilli
2014-03-28 15:04:48 UTC
Permalink
Post by Rémi Cura
Back after some other testing .
(_from memory, suppressing checks in addpoints was about 2 sec instead of 3
sec.
_About 21k unique points.)
(32k polylines,21k unique points, already 'topological'( no need to cut
the lines) ,with a lot of attributes)
Postgis Topology
Postgis Topology (addtopogeom) : 700+ sec for 10k lines, (??about 1600 sec
for 32k lines??)
Postgis Topology (addlinestring) : 240+ sec for 10k lines, (??about 780
sec for 32k lines??)
Postgis Topology (addnode, addedge) : 80sec for 10 k lines (??about 260sec
for 32k lines??)
Importing everything : 10 sec
Importing only geom (no attributes) : 5 sec
150 sec
Declare postgis topo as external source and fill it with shapefile : 145 sec
Nice! GRASS seems to be the way to go here.
It's interesting to see how a 10-seconds process becomes a 150-seconds
one to ensure output is modeled after PostGIS/ISO model. There's maybe
space for optimization there.

--strk;

()  ASCII ribbon campaign -- Keep it simple !
/\  http://strk.keybit.net/rants/ascii_mails.txt
Rémi Cura
2014-03-28 15:11:18 UTC
Permalink
Surely !

I have to outline that grass export to postgis topology is still in beta
mode.
Also, I have a feeling the grass export to postgis topology uses the
postgis topology functions, which kind of defeat the purpose (should bulk
insert into tables).


I sent a separate message, but GEOS is also promising :
5 seconds to load everything, build topology, compute face, output faces,
impressive!

I shall have CGAL benchmark soon.

Cheers,

Rémi-C
Post by Rémi Cura
Post by Rémi Cura
Back after some other testing .
(_from memory, suppressing checks in addpoints was about 2 sec instead
of 3
Post by Rémi Cura
sec.
_About 21k unique points.)
(32k polylines,21k unique points, already 'topological'( no need to cut
the lines) ,with a lot of attributes)
Postgis Topology
Postgis Topology (addtopogeom) : 700+ sec for 10k lines, (??about 1600
sec
Post by Rémi Cura
for 32k lines??)
Postgis Topology (addlinestring) : 240+ sec for 10k lines, (??about 780
sec for 32k lines??)
Postgis Topology (addnode, addedge) : 80sec for 10 k lines (??about
260sec
Post by Rémi Cura
for 32k lines??)
Importing everything : 10 sec
Importing only geom (no attributes) : 5 sec
150 sec
Declare postgis topo as external source and fill it with shapefile : 145
sec
Nice! GRASS seems to be the way to go here.
It's interesting to see how a 10-seconds process becomes a 150-seconds
one to ensure output is modeled after PostGIS/ISO model. There's maybe
space for optimization there.
--strk;
() ASCII ribbon campaign -- Keep it simple !
/\ http://strk.keybit.net/rants/ascii_mails.txt
_______________________________________________
postgis-users mailing list
http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
Sandro Santilli
2014-03-28 15:39:12 UTC
Permalink
Post by Rémi Cura
Surely !
I have to outline that grass export to postgis topology is still in beta
mode.
Also, I have a feeling the grass export to postgis topology uses the
postgis topology functions, which kind of defeat the purpose (should bulk
insert into tables).
Indeed! Start a thread on grass dev list ?
Keep me in CC is you do :)
Post by Rémi Cura
5 seconds to load everything, build topology, compute face, output faces,
impressive!
Yep, I can't find that mail, but I remember having read it.
The only con I can think of about exposing the GEOS geomgraph is
that it'll then need to be maintained stable :)

But can't really tell you if it makes it possible to associate
pre-existing topology primitive identifiers to its own structure
(for updates). It'd be an interesting experiment, but personally
I lack time for it.

--strk;

Loading...