Tuesday, September 30, 2014

JSON UDFs for MySQL in Labs - Take 2


About a year ago I wrote a little about the MySQL JSON UDFs from MySQL Labs.  You can find the original article and testing steps here.

With the most recent Labs release (0.3.2) that Sveta put out, I decided to try this again and see how far we've come.  For reference, the work is also documented on Sveta's blog - https://blogs.oracle.com/svetasmirnova/

I repeated the exact same steps to create the table and load it with the most recent zips.json data straight from the MongoDB folks.  After running a handful of documents through a quick external validation on the JSONLint website (quick plug: I love this thing, especially when I have to edit IAM policies for AWS), I set about loading them in again using the same 1.5-line Perl script I used last time.

To start with, the installation and documentation are much improved.  I was impressed with just how simple the install was and how fleshed out that README file is becoming - two things I had previously commented needed some small improvement.
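
For anyone following along, the install boils down to copying the shared library into the server's plugin_dir and then registering each function with CREATE FUNCTION.  A minimal sketch of the registration step - the library filename (and the exact RETURNS types) are my assumptions here, so check the README for the authoritative names and the full function list:

CREATE FUNCTION json_valid   RETURNS INTEGER SONAME 'my_json_udf.so';
CREATE FUNCTION json_extract RETURNS STRING  SONAME 'my_json_udf.so';
CREATE FUNCTION json_append  RETURNS STRING  SONAME 'my_json_udf.so';
CREATE FUNCTION json_replace RETURNS STRING  SONAME 'my_json_udf.so';
CREATE FUNCTION json_set     RETURNS STRING  SONAME 'my_json_udf.so';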

Now, the fun part.

The data loaded in cleanly using the same steps as before.

wget -q http://media.mongodb.org/zips.json

vim json_insert.pl

#!/usr/bin/perl
# Wrap each line of input (one JSON document per line) in an INSERT statement.
# Note: this naive approach assumes the documents contain no single quotes.
while (<>) { chomp; printf("INSERT INTO test.jtest (jdoc) VALUES('%s');\n", $_); }


CREATE DATABASE IF NOT EXISTS test;
USE test;
CREATE TABLE `jtest` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `jdoc` blob,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;

chmod +x json_insert.pl
cat zips.json | ./json_insert.pl | mysql
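
A quick sanity check that every document made it in - the row count should match the json_valid() total shown further down:

SELECT COUNT(*) FROM test.jtest;  -- expect 29353 rows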

Ok, so far so good.  Now let's try the queries from last time again:

SELECT json_extract(jdoc, "city") city, json_extract(jdoc, "pop") pop FROM test.jtest LIMIT 10;

+------+------+
| city | pop  |
+------+------+
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
+------+------+
10 rows in set (0.00 sec)

Hmm, that looks familiar...  Ok, maybe we're still working with the more formal definition of JSON here, meaning I need to quote both key and value again.  First, I'll check with the json_valid() UDF:

mysql> SELECT json_valid(jdoc) isValid, COUNT(*) FROM jtest GROUP BY isValid;
+---------+----------+
| isValid | COUNT(*) |
+---------+----------+
|       1 |    29353 |
+---------+----------+
1 row in set (0.42 sec)


Hmm, a few more zip codes than last time, but otherwise it claims the column contains nice, valid JSON documents.  Last time, to check functionality, I went ahead and replaced the geographical coordinates with the quoted word "blank" - satisfying the requirement for fully quoted key and value pairs and removing any list as a value.  This time, repeating that process nets me a familiar:

SELECT json_extract(jdoc, "city") city, json_extract(jdoc, "pop") pop FROM test.jtest LIMIT 10;

+------+------+
| city | pop  |
+------+------+
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
+------+------+
10 rows in set (0.00 sec)
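
For reference, the "blank" substitution I described above was along these lines - a hedged sketch using plain string functions (this labs-era MySQL has no regex replace, and it assumes exactly one bracketed array per document, spaced the way mongodb's export writes it):

UPDATE test.jtest
SET jdoc = CONCAT(
        SUBSTRING_INDEX(jdoc, '"loc" : [', 1),   -- everything before the array
        '"loc" : "blank"',                        -- quoted placeholder value
        SUBSTRING_INDEX(jdoc, ']', -1))           -- everything after the array
WHERE jdoc LIKE '%"loc" : [%';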

So, that brings us right back where we started.  I admit I haven't played much with the other functions yet; I keep getting stuck on this one, which for me and my uses is the big one (and if it works, json_append(), json_replace(), and json_set() will obviously be big ones for me too).

Playing around a little more, I keep finding the same thing.  This is still promising and I still hope it works, but it is not behaving the way I would expect, and none of the other variations I have concocted so far work either (a sample follows below).  Next, I think I will play with the built-in JSON functionality in PostgreSQL 9.3 and see how it compares.  Stay tuned for that in the next few days, I hope.
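
For the curious, the variations looked something like this.  These are illustrative reconstructions rather than an exhaustive list, and every one came back NULL as well:

SELECT json_extract(jdoc, '"city"') FROM test.jtest LIMIT 1;              -- double-quoted key
SELECT json_extract(CAST(jdoc AS CHAR), 'city') FROM test.jtest LIMIT 1;  -- BLOB cast to CHAR
SELECT json_extract(jdoc, 'city', 'pop') FROM test.jtest LIMIT 1;         -- path-style argument list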

In the meantime, Sveta - I love your work, and I think I'm following the examples in the README just fine, so what am I missing?  I'd love to show this working here.

Tuesday, July 22, 2014

Copying and Pasting SQL

Lesson learned today - Do not trust auto-formatting on websites / instant messaging apps / e-mail.

When copying and pasting queries, be careful about auto-formatting.  Many editors want to replace a plain hyphen-minus '-' (U+002D) with an en dash '–' (U+2013) or em dash.  SQL shells treat these very differently: only the hyphen-minus is parsed as the subtraction operator.
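
A minimal illustration in the mysql client (the exact error text varies by client and server version):

SELECT 10 - 3;   -- plain hyphen-minus (U+002D): parses fine, returns 7
SELECT 10 – 3;   -- en dash (U+2013) pasted in: rejected with a syntax error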

It took an embarrassing amount of time today for me to catch that one.  I hope someone is spared a few minutes by my wasted time.


Thursday, May 15, 2014

Older MySQL Poems

By unpopular demand, here are some of the ones from the Percona Live 2013 presentation.  I make no claims of quality:


deadlock detected
we rollback transaction two
err one two one three

- a MySQL Haiku


In the style of Edgar Allan Poe's "The Raven"...

Once upon a SQL query
While I joked with Apple's Siri
Formatting many a logical volume on my quad core
Suddenly there came an alert by email
as of some threshold starting to wail
wailing like my SMS tone
"Tis just Nagios", I muttered,
"Sending alerts unto my phone,
Only this - I might have known..."

Ah distinctly I remember
as I documented for each member
of the team just last November
in the wiki that we keep -
write and keep and nothing more...
When my query thus completed
Fourteen duplicate rows deleted
All my replicas then repeated
repeated the changes as before.
I dumped it all to a shared disk,
kept as a backup forever more.



There once was a small database program
It had InnoDB and MyISAM
One did transactions well,
and the other would crash like hell.
Between the two they used all of my RAM

- A MySQL Limerick -



Round and round the disk drive spins
but SSD sits still and grins.
It is randomly fast
for data current and past.
My database upgrade begins.



Flush local tables
Make an LVM snapshot
Backup with rsync

- A Haiku on simple backups -



Oracle is red,
IBM is blue,
I like stuff for free -
MySQL will do.



There was a DBA who lived in a shoe,
He had so many tables he didn't know what to do
He optimized some and analyzed a few
then truncated them all and left for the day



And a throwback to Dr. Seuss...

Do you like MyISAM?
I do not like it, Sam-I-am.
I do not like MyISAM.

Would you use it here or there?
I would not use it here or there.
I would not use it anywhere.
I do not like MyISAM.
I do not like it, Sam-I-am.

Would you like it in an e-commerce site?
Would you like it in the middle of the night?
I do not like it for an e-commerce site.
I do not like it in the middle of the night.
I would not use it here or there.
I would not use it anywhere.
I do not like MyISAM.
I do not like it Sam-I-am.

Would you could you for foreign keys?
Use it, use it, just use it please!
You may like it, you will see
Just convert these tables three…
Not for foreign keys, not for those tables three!
I will not use it, you let me be!

The DBA's Creed

Alright, it's time for another cheesy one, but here goes.  Feel free to modify it in the comments if you have something better:



This is my database.  There are many like it, but this one is mine.
My database is my best friend.  It is my life.  I must master it as I master my life.
My database, without me, is useless.  Without my database, I am useless.
I must query my database well.  I must make queries respond faster than my customers who are trying to kill me.  I must tune the database before he shoots me. I will.
My database and I know what COUNT(*) means.  It's the JOINs in the query, the temp tables on disk, or the IOPS used.  We know they should be sargable.  We will index properly...

My database is code, even as I, because it is my life.  Thus, I will learn it as a brother.  I will learn its weaknesses, its strength, its changelogs, its plugins, its views and its triggers.
I will keep my buffer_cache filled and ready, even as my dirty writes are flushed.  We will become part of each other.  We will...

Before Codd, I swear this creed.  My database and I are the defenders of my company's data.  We are the masters of our queries. Backups are the saviors of my life.

So be it, until there are no slow queries and all is normalized and cached.