Once again Vincent McBurney delivers a fantastic summary of the latest state of the Juxtacomm ETL patent case: SQL Server, DB2 and DataStage will fight out Data Integration Patent Infringement.
I'm most interested from the Constellar point of view - I first came across Constellar (then Information Junction) as a product on sale in late 1993 / early 1994 (before joining the company from Oracle in 1995), so it always seemed clear to me that it would qualify as prior art to Juxtacomm's 1998 patent. Oddly, it seems that the parties to the trial have agreed that Constellar Hub (and DataMirror Transformation Server) can be dropped from consideration; they won't be subject to damages - but equally they won't be considered as prior art. I don't understand that, but I guess the IBM lawyers must know what they are doing.
So, the case rumbles on, serving (if nothing else) to show how broken the US software patent system is.
Monday, July 06, 2009
Wednesday, July 01, 2009
SQLstream delivers instant data stream analysis of Mozilla 3.5 downloads
Here are a couple of posts describing the download monitor/dashboard that is giving up-to-the-second, country-by-country statistics for downloads of the latest Mozilla release, Firefox 3.5 (just about to top 5.5 million downloads since yesterday's launch). The dashboard has been put together with the help of my friends at SQLstream. Just don't try looking at it with Internet Explorer, as it doesn't support HTML5.
- SQLstream the Sequel - RealTime Intelligence for Mozilla (from ebizQ Presents BI in Action Virtual Conference ...)
- Julian Hyde on Open Source OLAP. And stuff.: SQLstream powers ... - "SQLstream gathers data from Mozilla's download centers around the world, assigns each record a latitude and longitude, and summarizes the information in a continuously executing SQL query. Data is read with sub-second latencies, ..." - http://julianhyde.blogspot.
Monday, June 15, 2009
Be Alert!
Here's a tale of woe from an organisation I know - anonymised to protect the guilty.
A couple of weeks after a major hardware and operating system upgrade, there was a major foul-up during a weekend batch process. What went wrong? What got missed in the (quite extensive) testing?
The symptom was that batch jobs run under concurrent manager were running late. Very late. In fact, they hadn't run. The external scheduling software had attempted to launch them, but failed. Worse than that, there had been no alerting over the weekend. Operators should have been notified of the failure of critical processes by an Enterprise Management (EM) tool.
Cut to the explanation:
As part of the O/S upgrade, user accounts on the server are now locked out after three or more failed login attempts. Someone in operations-land triggered a lockout on a unix account used to run the concurrent manager - and didn't report it to anyone, so it was never reset. That explained the concurrent manager failures.
The EM software that should have woken up the operators also failed. Surprise, surprise: it was using the same (locked-out) unix account.
And finally, the alerting rules recognised all kinds of warnings and errors, but no one had considered the possibility that the EM system itself would fail.
Well, it's only a business system; though a couple of C-level execs got their end-of-month reports a couple of days late, and there were plenty of red faces, nobody actually died...
Just keep an eye out for those nasty corner cases!
Sunday, June 07, 2009
Oracle Exadata posts #1 TPC-H result
Greg Rahn's Structured Data blog provides the data that Kevin Closson had to remove from his own blog. From an HP/Oracle point of view, it is a very good performance, reducing cost/QphH by a factor of 4.
However, it is interesting to see that the HP/Oracle solution is still more than 4 times the cost/QphH of the #2 placed Exasol solution (running on Fujitsu Primergy, and reported a year ago) - while the absolute performance improvement is relatively slight (1.16M queries/hr against 1.02M).
Tuesday, April 21, 2009
Flutter is the new Twitter
Worth a look, for those (like me) who find 140 characters too much to handle. Hat-tip to the BCS Oddit blog.
Sunday, February 22, 2009
Doubly dynamic SQL
It was great to see a new post from Oracle WTF last week, after a quiet period - which reminded me to post this example of a dynamic search.
I won't post the whole thing, and I have disguised the column names to protect the guilty. The basic problem is that the developer didn't quite understand that if you are going to generate a dynamic query, you don't have to include every possibility in the final SQL.
Let's say the example is based on books published in a given year. First, to decide whether to do a LIKE or an equality, he did this:
' WHERE' ||
' (('||p_exact||' = 1 AND pdc.title = '||chr(39)||p_title||chr(39)||')' ||
' OR ('||p_exact||' = 0 AND pdc.title LIKE '||chr(39)||l_title||'%'||chr(39)||')) ' ||
' AND ('||p_year||' = 0 OR pdc.year = '||p_year||')' ||
So at runtime you get both predicates coming through. Suppose you wanted an exact search (p_exact=1), for p_title='GOLDFINGER'. We don't know the year of publication so we supply 0. The generated predicates are:
WHERE ((1 = 1 AND pdc.title = 'GOLDFINGER')
OR (1 = 0 AND pdc.title LIKE 'GOLDFINGER%'))
AND (0 = 0 OR pdc.year = 0)
Wouldn't the logically equivalent:
WHERE (pdc.title = 'GOLDFINGER')
have been much easier? Add a few of these together and a nice indexed query plan soon descends into a pile of full table scans and humungous hash joins. Oh, and no use of bind variables, so in a busy OLTP application this could sabotage the SQL cache quite quickly.
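For comparison, here is a minimal sketch of the alternative - not the original code, and the table, column and parameter names are invented - which appends each predicate only when it is actually needed, and binds the values instead of concatenating literals (DBMS_SQL makes the conditional binding straightforward):

CREATE OR REPLACE PROCEDURE search_books (
   p_title IN VARCHAR2,  -- NULL means "any title"
   p_exact IN NUMBER,    -- 1 = exact match, 0 = prefix match
   p_year  IN NUMBER     -- 0 means "any year"
) AS
   l_sql  VARCHAR2(4000) := 'SELECT pdc.book_id FROM books pdc WHERE 1 = 1';
   l_cur  INTEGER;
   l_rows INTEGER;
BEGIN
   -- Append each predicate only when the caller actually supplied a value
   IF p_title IS NOT NULL THEN
      IF p_exact = 1 THEN
         l_sql := l_sql || ' AND pdc.title = :b_title';
      ELSE
         l_sql := l_sql || ' AND pdc.title LIKE :b_title || ''%''';
      END IF;
   END IF;
   IF p_year <> 0 THEN
      l_sql := l_sql || ' AND pdc.year = :b_year';
   END IF;

   l_cur := DBMS_SQL.OPEN_CURSOR;
   DBMS_SQL.PARSE(l_cur, l_sql, DBMS_SQL.NATIVE);

   -- Bind only the placeholders that were appended above
   IF p_title IS NOT NULL THEN
      DBMS_SQL.BIND_VARIABLE(l_cur, ':b_title', p_title);
   END IF;
   IF p_year <> 0 THEN
      DBMS_SQL.BIND_VARIABLE(l_cur, ':b_year', p_year);
   END IF;

   l_rows := DBMS_SQL.EXECUTE(l_cur);
   -- ... define columns and fetch as required, then:
   DBMS_SQL.CLOSE_CURSOR(l_cur);
END;
/

The optimizer then sees only the predicates that matter, and repeated calls with different titles or years share a handful of cursors instead of flooding the shared pool with one-off literal SQL.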
My favourite part though is with the sort order. The user can choose to order by a number of different columns, either ascending or descending:
l_order := ' ORDER BY ' ||
' case '||chr(39)||p_sort||chr(39)||' when ''publisher asc'' then publisher end asc, ' ||
' case '||chr(39)||p_sort||chr(39)||' when ''publisher desc'' then publisher end desc,' ||
' case '||chr(39)||p_sort||chr(39)||' when ''book_type asc'' then book_type end asc,' ||
' case '||chr(39)||p_sort||chr(39)||' when ''book_type desc'' then book_type end desc,' ||
' case '||chr(39)||p_sort||chr(39)||' when ''title asc'' then title end asc,' ||
' case '||chr(39)||p_sort||chr(39)||' when ''title desc'' then title end desc,' ||
' case '||chr(39)||p_sort||chr(39)||' when ''year asc'' then year end asc,' ||
' case '||chr(39)||p_sort||chr(39)||' when ''year desc'' then year end desc,' ||
' case '||chr(39)||p_sort||chr(39)||' when ''book_id asc'' then book_id end asc,' ||
' case '||chr(39)||p_sort||chr(39)||' when ''book_id desc'' then book_id end desc';
Yes, you got it; given that the variable p_sort has been picked from an LOV, the whole piece of PL/SQL can be replaced by:
l_order := ' ORDER BY ' ||p_sort;
That looks better, doesn't it?
Thursday, February 05, 2009
Tibco RV in a box - would appliances help streaming SQL?
Several others have posted on the new Tibco Messaging Appliance - apparently it's Tibco Rendezvous (RV) in a box OEM'd from Solace Systems. Paul Vincent ponders at the Tibco CEP blog:
- It’s quite feasible that the same approach could be used for “basic” complex event processing operations, especially those that don’t require history (or much persistence)
Monday, February 02, 2009
Analytics as a Service
With all this talk of SQLstream's recent v2.0 launch, I was interested to read Tim Bass's CEP blog posting on Analytics-as-a-Service. He calls it A3S - and rightly avoids calling it AaaS; apart from the fact that X-as-a-Service is almost as clichéd as XYZ 2.0 (and equally meaningless), careless use of that sequence of As and Ss could cause spam filters round the world to get over-excited.
If we must have the as-a-service tag, I'd like to trademark BI-as-a-service: BIAS. Apart from being a proper acronym, it also gets across that BI often gives you the answers you think you want - not necessarily the ones you really need.
Sunday, February 01, 2009
Shared Feeds
My posting rate has been quite low recently, but I have been enjoying using Google Reader to keep up with what everyone else is saying.
I've been sharing items with some friends / colleagues, but it seems harmless to open this up to the world. The topics are eclectic, but mainly around product development, event stream processing, RDBMS and data warehouse. Maybe the very occasional Dilbert or xkcd. You're all very welcome to see what I've shared - the links are now on the right hand side of this blog. And if you too are using a reader, here is a direct link to my shared items and here is the atom feed for them. If Google could just pack up my daily shares as a single blog post, wouldn't that be great.
I'm still successfully avoiding Twitter... I've no idea how I would find the time, and having seen Philip Schofield mentioned as a user (on a low-rent BBC Sunday afternoon show) - apparently he is now London's #5 tw*t(terer) - it must be hopelessly uncool anyway.
SQLstream launches v2.0 of its Event Stream Processing engine
As well as working as a freelance Oracle consultant, I have spent most of the last year working with SQLstream Inc on their Event Stream Processing engine, version 2.0 of which has now been launched.
My initial point of contact was Julian Hyde, with whom I worked in the Oracle CASE / Oracle Designer team in the early 90s. He went out to Oracle HQ and worked on bit-mapped indexes, then spent time at Broadbase before becoming best known as the founder-architect for the Mondrian OLAP server.
Event Stream Processing (ESP) is a development of what we might have called Active Database a few years ago. Rather than running queries against stored data as in an RDBMS, we register active queries which can be used to filter data arriving from any kind of stream. These active queries behave just like topic subscriptions in a message bus - except that unlike typical JMS implementations, the SQLstream query engine can do more than just filter. SQL queries against the data-in-flight can also (see the sketch after this list):
- join data between multiple streams, or between streams and tables
- aggregate data within a stream to give rolling or periodic aggregates based on time or row counts
- apply built-in and user-defined functions to columns within the stream's rows
- build a network of streams and views to support complex transformation and routing requirements
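To give a flavour, here is an illustrative continuous query - a sketch only, loosely based on SQLstream's streaming SQL dialect, with stream and column names that are mine rather than the product's - maintaining a rolling one-minute download count per country over an incoming stream:

SELECT STREAM
   d.ROWTIME,
   d.country,
   COUNT(*) OVER (
      PARTITION BY d.country
      ORDER BY d.ROWTIME
      RANGE INTERVAL '1' MINUTE PRECEDING
   ) AS downloads_last_minute
FROM download_events AS d;

Unlike a conventional query, this never finishes: each row arriving on the download_events stream produces an output row carrying the up-to-date windowed count, which a dashboard (or a downstream stream or view) can consume immediately.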
My experience working at JMS vendor SpiritSoft convinced me of the value of asynchronous message-based techniques, which in the last 10 years have spread out from high-end financial systems to ESB implementations all over the place. Since the early 80s we have seen how the RDBMS swept all other forms of structured data storage away. Now SQLstream's relational messaging approach removes the impedance mismatch between how we store data, and how we process it on the wire. In principle, this architecture can subsume both message-oriented (ESB style) and data-oriented (ETL style) integration architectures.
It should be said that "other ESP engines are available". Oracle itself has two projects on the go: its own CQL which I believe is still internal, and Oracle CEP (the rebranded BEA WebLogic Event Server - which itself is (or was) based on the open source Esper project). These two development threads will no doubt combine at some point (perhaps they already have?). IBM also has two or three independent CEP (complex event processing) projects on the go.
I think the same thing will happen to ESP / CEP as happened to ETL/EAI tools in the last ten or fifteen years. For sure, the database/application server vendors (especially Oracle and IBM) will sell plenty of this software within their respective client bases. An Oracle CEP system that was (or could be) tightly integrated with the RDBMS itself - maybe executing PL/SQL as well as Java functions - would be an easy sell. However multi-vendor sites will be interested in an agnostic / vendor-independent tool as a basis for their integration efforts. Just as Informatica has carved out a place for itself in competition with Oracle's ODI and OWB and IBM's DataStage, so SQLstream and other ESP vendors can fight for the common ground. It will be very interesting to see how it all turns out.
See the SQLstream product page for background, plus posts from Julian Hyde, CTO of SQLstream and Nicholas Goodman, Director of BI Solutions at Pentaho.
PS: here's another post from David Raab.
Friday, November 28, 2008
Unexpectedly honest job posting
I recently joined the Oracle Connections group on Linked-In and I'm getting regular daily mails with job postings and searches for work. Mostly harmless, but I've just seen a great one:
(You may need to be a member of Linked-In and/or the Oracle Connections group to follow the link).
I think we've all been there, confRigurating away to our heart's content, haven't we? It certainly explains a lot of the problems we see in production.
Sunday, November 02, 2008
OT: 3 Mobile Broadband doesn't much like Gmail and Blogger
I use 3 mobile broadband in the UK (and Italy) and I have only a couple of nags about it:
- I can't use Firefox 3 and Gmail together over mobile broadband - I have to switch to IE7, and sometimes I have to downgrade Gmail to the simple HTML version. I have 2 Gmail accounts (one work, one private) and the problem seems to be worse on the latter. The symptom is that the loading bar is followed by a blank screen and the status "Done".
- I simply cannot seem to log in to blogger.com over mobile broadband - hence no posts during the week (probably a good thing, as I'm supposed to be working). I get a 404.
These problems seem to persist whether I am in Italy with no bars on my reception, or in the West End with 5 bars. Does anyone have any idea what's going on?
Thursday, September 25, 2008
Exadata - has it been in development for two years?
Just my nit-picking mind, but why does the Exadata technical white paper say (at the time of writing at least) that it is "Copyright © 2006, Oracle"? I don't think they've been working on it that long - much more likely some soon-to-be-embarrassed technical writer has cut and pasted the standard boilerplate from an out-of-date source.
What, no rule-driven content management?
Exadata and the Database Machine - the Oracle "kilopod"
There have already been plenty of interesting posts about Oracle Exadata - notably of course from Kevin Closson here and here (update: and this analysis from Christo Kutrovsky) - but I just have one thing to say.
Larry Ellison was quoted in a number of reports saying the Oracle Database Machine "is 1,400 times larger than Apple’s largest iPod".
Larry, when you want to get over that something is big - really big that is, industrial scale even - just don't compare it with a (however wonderful) consumer toy. Not even with 1,400 of them. 1.4 kilopods is so not a useful measure.
By the way, can I trademark the word kilopod please? (presumably not - a quick google found a 2005 article using the same word, and there is some kind of science-in-society blog at kilopod.com).
Thursday, September 04, 2008
Back to the future - or is that forward to the past
I'm starting a new contract this coming Monday. I'd better keep the client confidential until I've found out how they feel about blogs, but the job revolves around data matching and data quality, using Oracle and SSA.
At the interview, I found myself less than 100m from the site of my first ever "proper" job (the old Scicon offices are now an upmarket West End hotel). So just 28 years and 1 month later, I will be sauntering along Oxford Street once again.
I'll be commuting daily at first, but I will try to stay down during the week at least some of the time if I can find somewhere cheap, clean and convenient. Any old colleagues around central London - sometime between now and Christmas we should meet up ...
Friday, July 25, 2008
Microsoft acquires DATAllegro
Lots of interesting posts on this news, from:
- Curt Monash's DBMS2 blog - all his DATAllegro links are here, as there are so many of them; he also links to other comments
- Philip Howard at Bloor and IT-Director suggests that MS is assembling an entire DW stack with Zoomix and perhaps one day Kalido and Ab Initio as additional components
- Mark Madsen from Intelligent Enterprise says what it means for customers, other vendors and BI
- Kevin Closson with his welcome plain-speaking (and Oracle-centric) viewpoint refers back to earlier posts wondering about some specific details of this emperor's clothes
- Seth Grimes thinks it's a mistake that will be slow to deliver (and may hurt existing DATAllegro customers); it should have been Dataupia
- DATAllegro CEO Stuart Frost sounds happy with his new role
Monday, June 30, 2008
ESB consolidation - Progress buys Iona
I guess Iona was one of the first well-known Irish software companies, and now it must be one of the last. It has succumbed to Progress Software for $106M. Paul Fremantle beat me to the news that Progress now owns, or is a major committer for, at least 4 different ESBs - Sonic, Artix, C24 and ActiveMQ/Camel/ServiceMix (plus Actional if you like) - hey, that's more than Oracle, isn't it? (or maybe not...).
Saturday, June 21, 2008
First come first severed (sic)
A little off topic perhaps, but sometimes the illiteracy of business communications just amazes me. This is from a Jobserve recruitment ad:
- A large railway company are looking to hire a Oracle Developer to join there expanding company on a 3 month rolling contract.
- The is major role and candidates will be selected on matching skills and on first come first severed basis
Saturday, June 14, 2008
Vote now to open up Metalink!
Richard Harding has proposed on Oracle Mix that Metalink should be opened up:
There is goldmine of useful information in Metalink and having access to it would optimize the efficiency of people using Oracle toolsets, in my opinion enhancing productivity and by inference Oracle adoption globally which would be win win for everyone.
I've expressed the same thoughts myself in the past. So all of you Oracle professionals who would benefit from access to Metalink but are not included in your employer's arrangements (especially for freelancers like me) - go and vote!
Wednesday, June 11, 2008
Another blogger
Good to see Mark Bobak blogging - he's one of the stronger contributors to the Oracle-L mail list. Thanks to Doug Burns for spotting his blog.