Tuesday, September 30, 2014

JSON UDF's for MySQL in Labs - Take 2


About a year ago I wrote a little on the MySQL JSON UDF's from the MySQL Labs.  You can find the original article and testing steps here.

I decided with the most recent Labs release (0.3.2) that Sveta put out I would try this again and see how far we've come.  For reference, the work is documented also here on Sveta's blog - https://blogs.oracle.com/svetasmirnova/

I repeated my exact same steps to create the table and load it full of the most recent zips.json data straight from the MongoDB folks.  After running a handful of documents through a quick external validation via the JSONLint website (quick plug, I love this thing, especially when I have to edit IAM policies for AWS stuff) I set about loading them in again using the perl 1.5 liner I used last time.

To start with, the installation and documentation are much improved.  I was impressed with just how simple it was and how flushed out that README file is becoming, two things I had previously commented on that needed some small improvement.

Now, the fun part.

Data loaded in normally, cleanly, nicely using the same steps as before.

wget -q http://media.mongodb.org/zips.json

vim json_insert.pl

#!/usr/bin/perl
while (<>) {chomp; printf("INSERT INTO test.jtest (jdoc) VALUES(\'%s\');\n", $_);}


CREATE DATABASE IF NOT EXISTS test;
USE test;
CREATE TABLE `jtest` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `jdoc` blob,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;

cat zips.json | ./json_insert.pl | mysql

Ok, so far so good.  Now let's try the queries from last time again:

SELECT json_extract(jdoc, "city") city, json_extract(jdoc, "pop") pop FROM test.jtest LIMIT 10;

+------+------+
| city | pop  |
+------+------+
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
+------+------+
10 rows in set (0.00 sec)

Hmm, that looks familiar...  Ok, maybe we're still working with the more formal definition of JSON here and I need to quote both key and value again.  I'll check with that whole 'json_valid' UDF here:

mysql> SELECT json_valid(jdoc) isValid, COUNT(*) FROM jtest GROUP BY isValid;
+---------+----------+
| isValid | COUNT(*) |
+---------+----------+
|       1 |    29353 |
+---------+----------+
1 row in set (0.42 sec)


Hmm, a few more zip codes than last time, but otherwise it claims the column contains nice, valid JSON documents.  Last time to check functionality I went ahead and replaced the geographical coordinates with just the word "blank" in quotes to satisfy the requirement of the fully quoted key and value pairs, also removing any list as a value.  This time, repeating the process nets me a familiar:

SELECT json_extract(jdoc, "city") city, json_extract(jdoc, "pop") pop FROM test.jtest LIMIT 10;

+------+------+
| city | pop  |
+------+------+
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
+------+------+
10 rows in set (0.00 sec)

So, that brings us right back where we started.  I haven't played too much with the other functions yet, I admit.  I keep getting stuck with this one, which at least to me and for my uses is the big one (if it works, obviously json_append(), json_replace(), and json_set() will be big ones to use for me)

Playing around a little more I am finding the same thing again here.  This is still promising and I still hope it works, but it is not working the way I would expect it to and it is not working in any of the other variations I have concocted so far.  I think I am next going to play with the built-in PostgreSQL functionality for JSON in 9.3 and see how it compares.  Stay tuned for that in the next few days, I hope.

In the meantime, Sveta - I love your work, I think I'm following the examples in the README just fine, but what am I missing here?  I'd love to show this working here.

No comments:

Post a Comment