Google+ Followers

woensdag 25 december 2013

Why I like PostgreSQL more than MySQL: XML support

PostgreSQL has native XML support. it can store XML documents in a true XML datatype, it can query XML documents using XPATH and it can generate XML documents from tabledata.

How does this help you in real life?

The one area where most of us are forced to deal with XML is a webshop that uses a fullfillment company. All the data about orders and stockupdates are sent back and forth using XML documents. PostgreSQL's XML support makes processing these messages into your database a breeze.

The following example uses a table to store incoming XML documents and a triggered stored function to process their content into a regular table.

DROP FUNCTION IF EXISTS fnc_process_xml() CASCADE;
DROP TABLE IF EXISTS product_history CASCADE;
DROP TABLE IF EXISTS xml_storage CASCADE;

CREATE TABLE xml_storage (id SERIAL, xmldata XML);
CREATE TABLE product_history (product_id INTEGER, eventtype VARCHAR(100), eventdate TIMESTAMP, quantity INTEGER);
CREATE FUNCTION fnc_process_xml() RETURNS TRIGGER AS
$$
BEGIN
INSERT INTO product_history (product_id, eventtype, quantity, eventdate)
SELECT unnest(xpath('/productdata/products/product/id/text()', NEW.xmldata))::text::integer
,unnest(xpath('/productdata/products/product/event/text()', NEW.xmldata))::text
,unnest(xpath('/productdata/products/product/quantity/text()', NEW.xmldata))::text::integer
,TO_TIMESTAMP((xpath('/productdata/@date', NEW.xmldata))[1]::text,'YYYYMMDDHHMISS');
RETURN NEW;
END;
$$
LANGUAGE PLPGSQL;
CREATE TRIGGER tgr_process_xml AFTER INSERT ON xml_storage FOR EACH ROW EXECUTE PROCEDURE fnc_process_xml();

Now you can insert a document into the storage table:

INSERT INTO xml_storage (id, xmldata) VALUES (1, '<productdata date="20131042124415">
<products>
<product><id>4</id><quantity>3</quantity><event>DELIVERY_RECEIVED</event></product>
<product><id>24</id><quantity>-2</quantity><event>SALE</event></product>
</products></productdata>'::XML);
INSERT INTO product_history (product_id, eventtype, quantity, eventdate)
SELECT unnest(xpath('/productdata/products/product/id/text()', xmldata))::text::integer
,unnest(xpath('/productdata/products/product/event/text()', xmldata))::text
,unnest(xpath('/productdata/products/product/quantity/text()', xmldata))::text::integer
,TO_TIMESTAMP((xpath('/productdata/@date', xmldata))[1]::text,'YYYYMMDDHHMISS')
FROM xml_storage;

And presto: the data ends up in the product_history table.

Generating the outgoing XML is also quite simple. This example puts out a simplified XML for a few orders:

DROP TABLE IF EXISTS orders;
CREATE TABLE  ORDERS (order_nr INTEGER, product_id INTEGER, quantity INTEGER, unit_price NUMERIC(8,2), total_price NUMERIC(8,2));
INSERT INTO orders (order_nr, product_id, quantity, unit_price, total_price)
VALUES (1, 3, 1, 2.40,  2.40), (1, 55, 2, 4.00, 8.00), (2, 150, 1, 14.00, 14.00);
WITH tmp_orderxmls AS (
SELECT XMLELEMENT(NAME SALES_ORDER
, XMLATTRIBUTES(order_nr AS NUMBER), XMLAGG(XMLELEMENT(NAME orderline
, XMLELEMENT(NAME ARTICLE_NUMBER, product_id)
, XMLELEMENT(NAME QUANTITY, quantity)
, XMLELEMENT(NAME UNIT_PRICE, unit_price)
, XMLELEMENT(NAME TOTAL_PRICE, total_price)
))) AS order_xml
FROM orders
GROUP BY order_nr)
SELECT XMLELEMENT(NAME SALES_ORDERS, XMLATTRIBUTES(TO_CHAR(CURRENT_TIMESTAMP,'YYYY-MM-DD HH:MI:SS') AS MESSAGE_DATE), XMLAGG(tmp_orderxmls.order_xml))
FROM tmp_orderxmls;

If you also include the filename then you can do more fun things like refuse to process the same file twice, or choose different INSERT queries to process different fileformats.

You could even take it one step further and use 9.3's COPY FROM PROGRAM to call on a binary to make PgSQL fetch the XML document itself, either through some FTP construct or wget: http://www.postgresonline.com/journal/archives/325-Using-wget-directly-from-PostgreSQL-using-COPY-FROM-PROGRAM.html