Fantastic! I used the RFC reference XML documents from xml.resource.org: 3124 small XML documents, around 4 MB in total. I loaded them with an MN8 script and stored them in DataStore over SAPDB using SEP (the Simple Exchange Profile) over XML-RPC.
The procedure took less than an hour, which means more than 50 documents per minute. That is about the same as PostgreSQL, but the difference is that by the time the documents were stored, the indexing was complete too. With PostgreSQL this took around 30 hours.
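As a rough illustration, a load loop like the one described above could look like this in Python. This is not the original MN8 script: the `store` remote method name and the endpoint URL are assumptions made for the sketch, since the post does not give the actual SEP calls.

```python
import xmlrpc.client
from pathlib import Path


def load_documents(directory, store):
    """Send every .xml file in `directory` to `store` and return the count.

    `store` is any callable taking (name, xml_text). It stands in for the
    remote call, whose real name and signature are not given in the post.
    """
    count = 0
    for path in sorted(Path(directory).glob("*.xml")):
        store(path.name, path.read_text(encoding="utf-8"))
        count += 1
    return count


def make_xmlrpc_store(url):
    """Build a `store` callable backed by an XML-RPC endpoint (hypothetical)."""
    proxy = xmlrpc.client.ServerProxy(url)
    # `store` is an assumed remote method name, not a documented SEP call.
    return lambda name, text: proxy.store(name, text)
```

With a server listening, the call would be something like `load_documents("rfc-xml", make_xmlrpc_store("http://localhost:8080/RPC2"))`, where both the directory and the URL are placeholders. Passing the sender in as a callable keeps the loop independent of the transport.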
DataStore stores the document as it receives it (it does not alter it), but to
allow very fast structured searches (get me the documents which have
this element containing this value and this attribute with this value)
it breaks the documents into small entities (pretty much as Google
does), entities which relational databases can use efficiently (they
actually use their indexes in their queries), along with keeping
information about the structure of that entity (part of that element
…). That way you can actually query tons of well or not so well
formatted documents using complex structure information and get the
From the 3124 documents, DataStore extracted 27000 entities (mostly words).
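To make the idea concrete, here is a minimal sketch of that kind of indexing, assuming a much simpler schema than DataStore really uses. It keeps each document unaltered, splits element text and attribute values into words tagged with their element path, and lets the database's own index answer a structured query. The table names, column names, and SQLite itself are illustration choices, not DataStore's actual design.

```python
import re
import sqlite3
import xml.etree.ElementTree as ET


def open_store():
    """Create an in-memory database with a document table and an entity table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE documents (doc_id INTEGER PRIMARY KEY, body TEXT)")
    db.execute("CREATE TABLE entities (doc_id INTEGER, path TEXT, attr TEXT, word TEXT)")
    # This index is what lets the database answer word lookups quickly.
    db.execute("CREATE INDEX entities_word ON entities (word, path, attr)")
    return db


def words(text):
    """Split text into lowercase words; None or empty text yields no words."""
    return re.findall(r"\w+", (text or "").lower())


def store_document(db, xml_text):
    """Keep the document unchanged and index its words by element path."""
    cur = db.execute("INSERT INTO documents (body) VALUES (?)", (xml_text,))
    doc_id = cur.lastrowid
    rows = []

    def walk(elem, parent_path):
        path = f"{parent_path}/{elem.tag}"
        # attr == "" marks words that come from element text.
        rows.extend((doc_id, path, "", w) for w in words(elem.text))
        for name, value in elem.attrib.items():
            rows.extend((doc_id, path, name, w) for w in words(value))
        for child in elem:
            walk(child, path)
            # Text after a child's closing tag still belongs to this element.
            rows.extend((doc_id, path, "", w) for w in words(child.tail))

    walk(ET.fromstring(xml_text), "")
    db.executemany("INSERT INTO entities VALUES (?, ?, ?, ?)", rows)
    return doc_id


def find(db, elem_path, elem_word, attr_path, attr_name, attr_word):
    """Ids of documents where an element contains a word and an attribute has a value."""
    sql = """
        SELECT DISTINCT e.doc_id FROM entities e
        JOIN entities a ON a.doc_id = e.doc_id
        WHERE e.path = ? AND e.attr = '' AND e.word = ?
          AND a.path = ? AND a.attr = ? AND a.word = ?
        ORDER BY e.doc_id
    """
    args = (elem_path, elem_word.lower(), attr_path, attr_name, attr_word.lower())
    return [row[0] for row in db.execute(sql, args)]
```

After storing a few `<reference anchor="..."><title>...</title></reference>` documents with `store_document`, a call such as `find(db, "/reference/title", "xml", "/reference", "anchor", "RFC2629")` returns the ids of the documents whose title contains "xml" and whose `anchor` attribute is "RFC2629". The original text is still available, untouched, in the `documents` table.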
And all this time I kept working as usual on the same computer.
I can’t explain the poor performance of PostgreSQL.
It does not perform much better on the Linux server either, and we tried
many versions. It must be something related to their JDBC drivers;
compared to IBM DB2 or MySQL it is way slower.
However, SAPDB rocks and it’s free (GPL/LGPL),
with versions for Linux and Windows (and many others), so if you are
looking for a free, industrial-strength relational database, look no
further. Should I mention that the Windows version comes with
excellent GUI management tools?