Tuesday, November 29, 2011

Breaking change in calling Groovy on 1.8 upgrade

I've been bitten by this a couple of times now, so for anyone else's benefit: if you have a bat file that calls a Groovy program, you may notice surprising behaviour after an upgrade from 1.7.x to 1.8.x (I went from 1.7.4 to 1.8.4).

If your bat file looks something like:
..some stuff..

groovy myGroovy
copy xyz abc

... more stuff ..
Then in 1.7.4 this called groovy.exe, ran the program, and carried on to copy the file. But in 1.8.x groovy.exe is deprecated, so you end up running groovy.bat instead. Unfortunately, when one Windows bat script invokes another by name like that, control jumps to the second script and never returns, so everything finishes at the end of groovy.bat and the copy never runs. To fix this, use the Windows CALL command:
..some stuff..

call groovy myGroovy
copy xyz abc

... more stuff ..
With the CALL, the groovy.bat script executes and then returns control to your script, and the copy and more stuff actually happen.

NOTE: I think the reason I hit this problem is that I installed the generic Groovy distribution rather than using the Windows-specific installer (e.g. here). But codehaus seems to be down right now.


Thursday, July 28, 2011

MySQL Group By is a little too indulgent

After 30 years of Oracle, I've found myself using MySQL recently. I came across a little thing that surprised me. I'm by no means the first to trip over this - I found this 2006 post from Peter Zaitsev on the same topic.

MySQL lets you write a GROUP BY query that selects columns that are neither in the GROUP BY clause nor aggregates. For example:

mysql> select table_name, column_name, count(*)
-> from information_schema.columns
-> where table_schema = 'information_schema'
-> group by table_name
-> limit 5;
+---------------------------------------+--------------------+----------+
| table_name                            | column_name        | count(*) |
+---------------------------------------+--------------------+----------+
| CHARACTER_SETS                        | CHARACTER_SET_NAME |        4 |
| COLLATIONS                            | COLLATION_NAME     |        6 |
| COLLATION_CHARACTER_SET_APPLICABILITY | COLLATION_NAME     |        2 |
| COLUMNS                               | TABLE_CATALOG      |       19 |
| COLUMN_PRIVILEGES                     | GRANTEE            |        7 |
+---------------------------------------+--------------------+----------+
5 rows in set (0.07 sec)


A similar query from any version of Oracle would fail:

SQL> select table_name, column_name, count(*)
2 from dba_tab_columns
3 group by table_name;
select table_name, column_name, count(*)
*
ERROR at line 1:
ORA-00979: not a GROUP BY expression


In effect MySQL is doing the GROUP BY as requested, and giving you the first value it comes across for the un-aggregated columns (COLUMN_NAME in this example). A near equivalent Oracle query would be:

SQL> select table_name, min(column_name), count(*)
2 from dba_tab_columns
3* group by table_name


TABLE_NAME                     MIN(COLUMN_NAME)                 COUNT(*)
------------------------------ ------------------------------ ----------
ICOL$                          BO#                                    14
PROXY_ROLE_DATA$               CLIENT#                                 3
TS$                            AFFSTRENGTH                            32
ARGUMENT$                      ARGUMENT                               25
IDL_CHAR$                      LENGTH                                  6
TRIGGER$                       ACTION#                                19
TRIGGERCOL$                    COL#                                    5


But in the Oracle case we are explicitly selecting the MIN(column_name), whereas MySQL's laxer behaviour just picks whichever column name it happens to meet first in each group (not truly random, but dependent on the execution plan).
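
If you would rather have MySQL complain the way Oracle does, the ONLY_FULL_GROUP_BY SQL mode is intended for exactly this. A minimal sketch, assuming you are allowed to change sql_mode for your own session; the exact error text varies between MySQL versions, so none is shown here:

```sql
-- Assumption: you may change sql_mode for the current session.
-- Note: this replaces any other modes already set for the session.
set session sql_mode = 'ONLY_FULL_GROUP_BY';

-- Expected to be rejected now: column_name is neither grouped nor aggregated.
select table_name, column_name, count(*)
from information_schema.columns
where table_schema = 'information_schema'
group by table_name;

-- Accepted: the choice of column_name is stated explicitly, as in the Oracle query.
select table_name, min(column_name), count(*)
from information_schema.columns
where table_schema = 'information_schema'
group by table_name;
```

The setting only lasts for the session, so it is a cheap way to check a suspicious query without touching the server configuration.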

So: when grouping in MySQL, make doubly certain that your SQL is really returning the rows you expected. In our example it is possible that the intention was actually the very different:

mysql> select table_name, column_name, count(*)
-> from information_schema.columns
-> where table_schema = 'information_schema'
-> group by table_name, column_name
-> limit 20;
+---------------------------------------+--------------------------+----------+
| table_name                            | column_name              | count(*) |
+---------------------------------------+--------------------------+----------+
| CHARACTER_SETS                        | CHARACTER_SET_NAME       |        1 |
| CHARACTER_SETS                        | DEFAULT_COLLATE_NAME     |        1 |
| CHARACTER_SETS                        | DESCRIPTION              |        1 |
| CHARACTER_SETS                        | MAXLEN                   |        1 |
| COLLATIONS                            | CHARACTER_SET_NAME       |        1 |
| COLLATIONS                            | COLLATION_NAME           |        1 |
| COLLATIONS                            | ID                       |        1 |
| COLLATIONS                            | IS_COMPILED              |        1 |
| COLLATIONS                            | IS_DEFAULT               |        1 |
| COLLATIONS                            | SORTLEN                  |        1 |
| COLLATION_CHARACTER_SET_APPLICABILITY | CHARACTER_SET_NAME       |        1 |
| COLLATION_CHARACTER_SET_APPLICABILITY | COLLATION_NAME           |        1 |
| COLUMNS                               | CHARACTER_MAXIMUM_LENGTH |        1 |
| COLUMNS                               | CHARACTER_OCTET_LENGTH   |        1 |
| COLUMNS                               | CHARACTER_SET_NAME       |        1 |
| COLUMNS                               | COLLATION_NAME           |        1 |
| COLUMNS                               | COLUMN_COMMENT           |        1 |
| COLUMNS                               | COLUMN_DEFAULT           |        1 |
| COLUMNS                               | COLUMN_KEY               |        1 |
| COLUMNS                               | COLUMN_NAME              |        1 |
+---------------------------------------+--------------------------+----------+
20 rows in set (0.06 sec)

Happy debugging everyone!