Preferisco - thoughts about Oracle, EAI/ETL and more

Wednesday, April 03, 2013

Thirty years on, 10gen emulates Oracle

It's now more than thirty years since I first came across the Oracle database. At that time, Oracle had only just got a distributor in the UK (a small part of CACI, with just three staff: Geoff Squire, Mike Evans and Chris Ellis - soon to be joined by Ian Thacker). We selected Oracle for an MOD project and the rest is history.

SQL and relational database were a completely new thing then (actually, any databases were a pretty new thing - I'd got through an entire Computer Science degree without ever coming across even a sniff of a database). During the mid '80s, Oracle UK did a fantastic job of proselytising relational as a concept, and SQL as the database lingua franca. They ran regular "no obligation" seminars that were accessible to both techies and their managers; these were held frequently in London, and from time to time around the rest of the UK (and through Europe as things took off). I don't know whether the same was true in the USA, but I expect it was. The seminars generated a lot of interest, and I'm sure the campaign is one of the principal reasons Oracle was able to steal the top dog position it has held ever since in the RDBMS world.

Well, here we are and 10gen (G+: +MongoDB, http://twitter.com/MongoDB) - developers of MongoDB - seem to have learned the lesson. They are doing a heck of a lot to spread the word, including (since last October) running free introductory online courses. I've just completed the sixth and last full week of M101J - MongoDB for Java Developers (week 7 is the final exam). More on the detail in another post - but the main point is that this relatively small amount of investment in education should generate a lot of demand (not to mention good feeling). If you're interested, the courses are being run on a cycle - next one to start is M102 - MongoDB for DBAs, on April 29 - see https://education.10gen.com/courses for more details.

Even if you don't want to be drinking the Kool-Aid, working through the course is a great way of beginning to understand the strengths and weaknesses of MongoDB in particular, or NoSQL / document oriented databases in general.

Thursday, January 26, 2012

Managing Windows scheduled tasks - SCHTASKS output misleading

Here's a little gem - found on Windows Server SP2 but still there on Windows Server 2008 R2 SP1 at least.

I wanted to write a little script to disable some scheduled tasks (for maintenance) then after a predetermined time to re0-enable them. This is a common support problem, and I find I often complete the maintenance and forget to re-enable the tasks which results in alarms going off - but maybe not until the start of the next working day.

Anyhow, Windows gives you (at least) two ways of interacting with scheduled tasks:

SCHTASKS
PowerShell and the PowerShellPack which includes a TaskScheduler module

Although Powershell is an attractive option for scripting, PowerShellPack is poorly documented and the TaskScheduler module is a bit lacking. You can create, start, stop, register or get a task but there doesn't seem to be a cmdlet for actually enabling or disabling a task.

So, back to using groovy as a wrapper to SCHTASKS. All fine, we can use execute to create a CMD process that calls SCHTASKS /query to get the task status. Here's an example using easy-to-parse CSV format:

C:\>schtasks /query /fo csv /tn "\Apple\AppleSoftwareUpdate"
"TaskName","Next Run Time","Status"
"\Apple\AppleSoftwareUpdate","31/01/2012 11:15:00","Ready"

We can see that "Status" is in field 3 on the heading line, and its value is "Ready" on the data line. That's great.
To disable the task, we can then:

C:\>schtasks /change /disable  /tn "\Apple\AppleSoftwareUpdate"
SUCCESS: The parameters of scheduled task "\Apple\AppleSoftwareUpdate" have been changed.

Now let's check the status again:

C:\>schtasks /query /fo csv /tn "\Apple\AppleSoftwareUpdate"

"TaskName","Next Run Time","Status"
"\Apple\AppleSoftwareUpdate","Disabled",""

Yay! The task is indeed disabled - but look how the status has swapped into field 2 - under "Next Run Time". Presumably because there is no next run time while the task is disabled. A blank 3rd field value has been provided, but it is in the wrong place. Whatever way you list out the data, the error is still there:

C:\>schtasks /query /fo table /tn "\Apple\AppleSoftwareUpdate"

Folder: \Apple
TaskName                                 Next Run Time          Status
======================================== ====================== ===============
AppleSoftwareUpdate                      Disabled


C:\>schtasks /query /fo list /tn "\Apple\AppleSoftwareUpdate"

Folder: \Apple
HostName:      PIERO
TaskName:      \Apple\AppleSoftwareUpdate
Next Run Time: Disabled
Status:
Logon Mode:    Interactive/Background

OK, now I know this, I can work around it. But another example of MS inconsistency (which no doubt is now firmly baked in for "backward compatibility" for ever and a day...

Tuesday, November 29, 2011

Breaking change in calling Groovy on 1.8 upgrade

I've been bitten by this a couple of times now, so for anyone else's benefit: If you have a bat file that calls a groovy program, you may notice surprising behaviour after an upgrade from 1.7.x to 1.8.x (I went from 1.7.4 to 1.8.4).

If your bat file looks something like:

..some stuff..

groovy myGroovy
copy xyz abc

... more stuff ..

Then in 1.7.4 you would have called groovy.exe, executed the program, then continued to copying the file. But in 1.8.x groovy.exe is deprecated so instead you execute groovy.bat. Unfortunately, when a Windows bat script calls another in that way, it effectively jumps to the script (with no return) so the script finishes at the end of groovy.bat. To fix this, use the Windows CALL instruction:

..some stuff..

call groovy myGroovy
copy xyz abc

... more stuff ..

With the CALL, the groovy.bat script executes and then returns control to your script, and the copy and more stuff actually happens.

NOTE: I think the reason I have the problem is that I installed the generic groovy rather than using the specific windows installer (eg here). But codehaus seems to be down right now.

Thursday, July 28, 2011

MySQL Group By is a little too indulgent

After 30 years of Oracle, I've found myself using MySQL recently. I came across a little thing that surprised me. I'm by no means the first to trip over this - I found this 2006 post from Peter Zaitsey on the same topic.

MySQL lets you write a group by statement that references columns that aren't in the group by, and aren't aggregates. For example:


mysql> select table_name, column_name, count(*)
    -> from information_schema.columns
    -> where table_schema = 'information_schema'
    -> group by table_name
    -> limit 5;
+---------------------------------------+--------------------+----------+
| table_name                            | column_name        | count(*) |
+---------------------------------------+--------------------+----------+
| CHARACTER_SETS                        | CHARACTER_SET_NAME |        4 |
| COLLATIONS                            | COLLATION_NAME     |        6 |
| COLLATION_CHARACTER_SET_APPLICABILITY | COLLATION_NAME     |        2 |
| COLUMNS                               | TABLE_CATALOG      |       19 |
| COLUMN_PRIVILEGES                     | GRANTEE            |        7 |
+---------------------------------------+--------------------+----------+
5 rows in set (0.07 sec)

A similar query from any version of Oracle would fail:


SQL> select table_name, column_name, count(*)
  2  from dba_tab_columns
  3  group by table_name;
select table_name, column_name, count(*)
                   *
ERROR at line 1:
ORA-00979: not a GROUP BY expression

In effect MYSQL is doing the GROUP BY as requested, and giving you the first value it comes across for the un-aggregated columns (COLUMN_NAME in this example). A near equivalent Oracle query would be:


SQL> select table_name, min(column_name), count(*)
  2  from dba_tab_columns
  3* group by table_name


TABLE_NAME                     MIN(COLUMN_NAME)                 COUNT(*)
------------------------------ ------------------------------ ----------
ICOL$                          BO#                                    14
PROXY_ROLE_DATA$               CLIENT#                                 3
TS$                            AFFSTRENGTH                            32
ARGUMENT$                      ARGUMENT                               25
IDL_CHAR$                      LENGTH                                  6
TRIGGER$                       ACTION#                                19
TRIGGERCOL$                    COL#                                    5

But in the Oracle case we are explicitly selecting the MIN(column_name), whereas MySQL's laxer behaviour is just picking the first column name at random (or rather, dependent on the execution plan).

So: when grouping in MySQL, make double certain that your SQL is really returning the number of rows you expected. In our example it is possible that the intention was actually the very different:


mysql> select table_name, column_name, count(*)
    -> from information_schema.columns
    -> where table_schema = 'information_schema'
    -> group by table_name, column_name
    -> limit 20;
+---------------------------------------+--------------------------+----------+
| table_name                            | column_name              | count(*) |
+---------------------------------------+--------------------------+----------+
| CHARACTER_SETS                        | CHARACTER_SET_NAME       |        1 |
| CHARACTER_SETS                        | DEFAULT_COLLATE_NAME     |        1 |
| CHARACTER_SETS                        | DESCRIPTION              |        1 |
| CHARACTER_SETS                        | MAXLEN                   |        1 |
| COLLATIONS                            | CHARACTER_SET_NAME       |        1 |
| COLLATIONS                            | COLLATION_NAME           |        1 |
| COLLATIONS                            | ID                       |        1 |
| COLLATIONS                            | IS_COMPILED              |        1 |
| COLLATIONS                            | IS_DEFAULT               |        1 |
| COLLATIONS                            | SORTLEN                  |        1 |
| COLLATION_CHARACTER_SET_APPLICABILITY | CHARACTER_SET_NAME       |        1 |
| COLLATION_CHARACTER_SET_APPLICABILITY | COLLATION_NAME           |        1 |
| COLUMNS                               | CHARACTER_MAXIMUM_LENGTH |        1 |
| COLUMNS                               | CHARACTER_OCTET_LENGTH   |        1 |
| COLUMNS                               | CHARACTER_SET_NAME       |        1 |
| COLUMNS                               | COLLATION_NAME           |        1 |
| COLUMNS                               | COLUMN_COMMENT           |        1 |
| COLUMNS                               | COLUMN_DEFAULT           |        1 |
| COLUMNS                               | COLUMN_KEY               |        1 |
| COLUMNS                               | COLUMN_NAME              |        1 |
+---------------------------------------+--------------------------+----------+
20 rows in set (0.06 sec)

Happy debugging everyone!

Friday, November 26, 2010

OT: Good value first class fare?

From the NationalRail website:

One million squids for a day return?

Monday, June 14, 2010

Using Groovy AntBuilder to zip / unzip files

I've been quiet for quite a while - partly because I am not working with Oracle just at the moment. I have been building some automated workflow systems using Groovy as the scripting language. I've known about Groovy since James Strachan first invented it back in around 2002/3 - but this is the first time I've really been using it in earnest. It's a great for portable scripts, and for integration with Java (it runs in the JVM). It's much friendlier than Java for someone like me who comes from a PL/SQL and C (not C++) background.

Anyhow, I found out about using Groovy Antbuilder tasks, and have been using them to manage zipping / unzipping file sets:


def ant = new AntBuilder();   // create an antbuilder
ant.unzip(  src: planZipFile, dest:workingDirName,  overwrite:"true")

Then I found I wanted to flatten the output (ie don't reproduce the directory structure). The Apache Ant documentation for the unzip task shows the Ant XML:

<unzip src="apache-ant-bin.zip" dest="${tools.home}">
    <patternset>
        <include name="apache-ant/lib/ant.jar"/>
    </patternset>
    <mapper type="flatten"/>
</unzip>

How to add the mapper element?

Well, lots of googling later, I couldn't find an example but I did see the patternset being used. So thanks to that, I found that the Groovy way of expressing the mapper part of this is to add a closure after the call:

def ant = new AntBuilder();
ant.unzip(  src: planZipFile, dest:workingDirName,  overwrite:"true"){ mapper(type:"flatten")};

So I hope someone finds that useful.

Friday, February 12, 2010

Customer satisfaction - the Xerox Effect

Thanks to Martin Widlake for pointing to this gem of a paper from Dennis Adams (pdf), pointing out that an increase in customer satisfaction can lead to an increase in negative feedback, and vice versa. Anyone who has worked in customer support (whether on an internal help desk or for external customers) will have gone through a "why don't they love us, we're doing such a great job for them?" period. This might explain why.

Preferisco - thoughts about Oracle, EAI/ETL and more

Wednesday, April 03, 2013

Thirty years on, 10gen emulates Oracle

Thursday, January 26, 2012

Managing Windows scheduled tasks - SCHTASKS output misleading

Tuesday, November 29, 2011

Breaking change in calling Groovy on 1.8 upgrade

Thursday, July 28, 2011

MySQL Group By is a little too indulgent

Friday, November 26, 2010

OT: Good value first class fare?

Monday, June 14, 2010

Using Groovy AntBuilder to zip / unzip files

Friday, February 12, 2010

Customer satisfaction - the Xerox Effect

Nigel's shared items in Google Reader

Nigel's Shared Items - Get them for yourself

Blog Archive

Labels

Links

About Me

Nigel"s Profile on Linked In

Map