Friday, November 28, 2008

Unexpectedly honest job posting

I recently joined the Oracle Connections group on Linked-In and I'm getting regular daily mails with job postings and searches for work. Mostly harmless, but I've just seen a great one:

(You may need to be a member of Linked-In and/or the Oracle Connections group to follow the link).

I think we've all been there, confRigurating away to our heart's content, haven't we? It certainly explains a lot of the problems we see in production.

Sunday, November 02, 2008

OT: 3 Mobile Broadband doesn't much like Gmail and Blogger

I use 3 mobile broadband in the UK (and Italy) and I have only a couple of nags about it:
  1. I can't use Firefox 3 and Gmail together over mobile broadband - I have to switch to IE7 and sometimes I have to downgrade Gmail to the simple HTML version. I have 2 Gmail accounts (one work, one private) and the problem seems to be worse on the latter. The symptom is that the loading bar is followed by a blank screen and the status "Done".
  2. I simply cannot seem to log in to blogger.com over mobile broadband - hence no posts during the week (probably a good thing as I'm supposed to be working). I get a 404.

These problems seem to persist whether I am in Italy with no bars on my reception, or in the West End with 5 bars. Does anyone have any idea what's going on?

Thursday, September 25, 2008

Exadata - has it been in development for two years?

Just my nit-picking mind, but why does the Exadata technical white paper say (at the time of writing at least) that it is "Copyright © 2006, Oracle"? I don't think they've been working on it that long - much more likely some soon-to-be-embarrassed technical writer has cut and pasted the standard boilerplate from an out-of-date source.

What, no rule-driven content management?

Exadata and the Database Machine - the Oracle "kilopod"

There have already been plenty of interesting posts about Oracle Exadata - notably of course from Kevin Closson here and here (update: and this analysis from Christo Kutrovsky) - but I just have one thing to say.

Larry Ellison was quoted in a number of reports saying the Oracle Database Machine "is 1,400 times larger than Apple’s largest iPod".

Larry, when you want to get over that something is big - really big that is, industrial scale even - just don't compare it with a (however wonderful) consumer toy. Not even with 1,400 of them. 1.4 kilopods is so not a useful measure.

By the way, can I trademark the word kilopod please? (presumably not - a quick google found a 2005 article using the same word, and there is some kind of science-in-society blog at kilopod.com).

Thursday, September 04, 2008

Back to the future - or is that forward to the past

I'm starting a new contract this coming Monday. I'd better keep the client confidential until I've found out how they feel about blogs, but the job revolves around data matching and data quality, using Oracle and SSA.

At the interview, I found myself less than 100m from the site of my first ever "proper" job (the old Scicon offices are now an upmarket West End hotel). So just 28 years and 1 month later, I will be sauntering along Oxford Street once again.

I'll be commuting daily at first, but I will try to stay down during the week at least some of the time if I can find somewhere cheap, clean and convenient. Any old colleagues around central London - sometime between now and Christmas we should meet up ...

Friday, July 25, 2008

Microsoft acquires DATAllegro

Lots of interesting posts on this news. Meanwhile Seth Grimes also uses the news to point up a real difference in quality between Google and Microsoft search capabilities.

Monday, June 30, 2008

ESB consolidation - Progress buys Iona

I guess Iona was one of the first well known Irish software companies, and now it must be one of the last. It has succumbed to Progress Software for $106M. Paul Fremantle beat me to the news that Progress now owns or is a major committer for at least 4 different ESBs - Sonic, Artix, C24 and ActiveMQ/Camel/ServiceMix (plus Actional if you like) - hey, that's more than Oracle, isn't it? (or maybe not...).

Saturday, June 21, 2008

First come first severed (sic)

A little off topic perhaps, but sometimes the illiteracy of business communications just amazes me. This is from a Jobserve recruitment ad:
  • A large railway company are looking to hire a Oracle Developer to join there expanding company on a 3 month rolling contract.
Next comes a fairly detailed set of technical requirements; as it seems to make sense, it was no doubt supplied by the said railway company. Finally the ad ends with this appeal:
  • The is major role and candidates will be selected on matching skills and on first come first severed basis
Sounds painful! Seems unfair to be severed for being the first response. Perhaps I should wait a week before applying?

Saturday, June 14, 2008

Vote now to open up Metalink!

Richard Harding has proposed on Oracle Mix that Metalink should be opened up:

There is goldmine of useful information in Metalink and having access to it would optimize the efficiency of people using Oracle toolsets, in my opinion enhancing productivity and by inference Oracle adoption globally which would be win win for everyone.

I've expressed the same thoughts myself in the past. So all of you Oracle professionals who would benefit from access to Metalink but are not included in your employer's arrangements (especially for freelancers like me) - go and vote!

Wednesday, June 11, 2008

Another blogger

Good to see Mark Bobak blogging - he's one of the stronger contributors to the Oracle-L mail list. Thanks to Doug Burns for spotting his blog.

Friday, June 06, 2008

BEA Aqualogic broken up

The Register reports on how Oracle is going about the integration / breakup of BEA's Aqualogic SOA products (splitting them between ex-Stellent and Oracle Fusion product lines). Oracle's actions are described as "firm but fair".

Tuesday, May 13, 2008

Performance problems concatenating LOBs

Greg Pike talks about some potential performance pitfalls with LOBs; I would comment on his blog, but it's login-only. Anyway, I posted on these problems last year (specifically with DBMS_LOB.APPEND), and posted some code to deal with it. Batching up appends makes a big difference - a 15-20 fold improvement in the example shown (against 9.2.0.5).
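
In case the old post is hard to find, here's a minimal sketch of the batching idea (the loop count and piece size are invented for illustration): accumulate the small strings in a VARCHAR2 buffer, and only call into the LOB layer when the buffer is nearly full.

declare
  l_clob   clob;
  l_piece  varchar2(100) := rpad('x', 100, 'x');
  l_buffer varchar2(32767);
begin
  dbms_lob.createtemporary(l_clob, true);
  for i in 1 .. 100000 loop
    if nvl(length(l_buffer), 0) + length(l_piece) > 32767 then
      -- flush: one LOB call per ~32K of data instead of one per piece
      dbms_lob.writeappend(l_clob, length(l_buffer), l_buffer);
      l_buffer := null;
    end if;
    l_buffer := l_buffer || l_piece;
  end loop;
  if l_buffer is not null then
    dbms_lob.writeappend(l_clob, length(l_buffer), l_buffer);  -- final partial buffer
  end if;
  dbms_lob.freetemporary(l_clob);
end;
/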

Thursday, May 08, 2008

MS subpoenas Oracle in Juxtacomm ETL patent case

Vincent McBurney summarises the latest developments in the Juxtacomm "we invented ETL, yeah really, give us all your money" patent case, which I have mentioned before. Dang, don't these "obvious" cases take a lot of time and effort to clear up.

Monday, April 21, 2008

Latrz - for all those web pages you want to read, just not right now



James Strachan blogged about Latrz here. It's a neat little Google app that lets you bookmark and tag pages you want to read later, then come back and read them (and check them off your list) when you've got a spare few minutes. Coming soon, you'll also be able to share the stuff you liked with your friends / work colleagues.

Monday, April 07, 2008

Kalido Business Information Modeler - everyone should have one

I posted about Kalido Business Information Modeler back in February, but now it's gone GA and Philip Howard reviews it at IT-Director. He is pretty enthusiastic and says "everyone should have Kalido Dynamic Information Warehouse and they should certainly have Universal Information Director and Business Information Modeler too."

The only negative he comes up with is "the relatively limited number of platforms that Kalido's software runs on: one would like to see it on Netezza for example, or Teradata".

Monday, March 31, 2008

IBM FastTrack for Source To Target Mapping for DataStage

Vincent McBurney was first to report IBM FastTrack for Source To Target Mapping, and he has been followed up by Philip Howard's note "the business face of data integration" at IT-Director.

I find it interesting that releasing a simple attribute mapping tool is seen as a major breakthrough for the DataStage/Information Server family; Constellar had more or less exactly that 12 years ago (no glossary though); the UI may not have been quite so business-user friendly, but it certainly supported point and click, plus simple integration with metadata repositories.

Thursday, March 13, 2008

Vitria brings Web 2.0 and high transaction rates to BPM with M3O

Vitria recently announced its new M3O product, which it claims is the convergence of BPM, Web 2.0 and Event Processing. Bloor's Simon Holloway reviews it here, where he quotes Vitria saying:
  • "BPM provides standards-based executable modelling (based on BPMN) on top of business knowledge Repository.
  • Web 2.0 provides the rich user experience with zero footprint to enable a collaborative design environment.
  • Event processing provides the support for rule and process definition and real-time runtime performance based on event driven architecture.
  • Only when you combine these together do you get a fundamentally new user experience with multilayer visualization, collaborative modelling environment, business level abstractions and event management"

Leaping shamelessly onto a passing bandwagon, Vitria explains M3O as "think iPhone meets dashboards" (quoted from ebizQ). The idea is that the "iPhone coolness" of the Web 2.0 interface will remove the gap between business and IT people. Well, as long as it doesn't (like the iPhone) lock users into an expensive long term relationship...

This looks like the first fruits from the return of JoMei Chang as CEO last July and the decision to go private, executed last March.

Wednesday, March 12, 2008

Cape Clear-out

I just caught up with the month-old news that Irish ESB/SOA vendor Cape Clear has been bought by Workday, a supplier of ERP SaaS.

Of course, SaaS needs integration, and Cape Clear CEO Annrai O'Toole promises that they will be providing the necessary Integration-on-Demand - but it seems like one of the leading independent commercial vendors has now been marginalised.

Ronan Bradley, former CEO of PolarLake (also Irish - one of our partners at SpiritSoft, and later a client of mine) also worries where the ESB market is going.

Is this a general problem in the middleware market, or is it just 'cos they is Irish? Are small vendors being caught between the rock of large vendors and the hard place of open source? Or is this (as Steve Craggs suggests in a comment on Ronan's post) simply a result of Cape Clear's own hubris?

Sunday, March 09, 2008

Change of scene

My current project is coming to a sudden end, so it looks like my weekly commute from Suffolk to Lancashire will be done by Easter. I've already got two or three interesting (and very different) opportunities to consider over the weekend - and if they don't come off, there's plenty on jobserve. More news as it happens; I am looking forward to spending a little bit more time at home and a little less time on the M6.

Saturday, March 08, 2008

Quantum dot memory

Next time you have trouble with i/o performance, think about this: wouldn't it be great to fit a terabyte of non-volatile RAM onto a one square inch postage stamp, with write times of 6 nanoseconds? Soon quantum-dot memory may do just that for you. That's nearly as fast as regular DRAM, and around 1000 times faster than typical flash memory. Scientists even say they could eventually get the write time down to picoseconds. Scary or what?

Saturday, March 01, 2008

H-Store - a new architectural era, or just a toy?

Philip Howard's commentary Merchant relational databases: over-engineered and out-of-date? supports the idea that perhaps general purpose relational databases should now be treated as "legacy". He references a paper The End of an Architectural Era (It's time for a rewrite) by Michael Stonebraker and others from MIT.

My first thought was that RDBMS developers such as Oracle have seen off previous architectural challengers - most notably object oriented databases (OODBMS) - in the past 25-30 years. What makes Stonebraker's H-Store any different?

First, a quick summary of the paper:

  • RDBMSs were designed 30 years ago - since then memory and cpu have become faster, cheaper and bigger, changing the balance against magnetic (disc) storage
  • increasingly they are failing to meet today's complex challenges
  • niche solutions have overtaken "general purpose" RDBMS in many areas (eg data appliances for business intelligence; specialist text search engines; etc)
  • and now even OLTP, the "core competence" of the RDBMS, is no longer safe; a new approach (such as H-Store) can easily beat traditional RDBMS by cutting out non-functional architectural features (eg: redo logs stored on disc) and achieving the same goals (ACID transactions) in another way.

The paper claims that H-Store can beat a traditional RDBMS at TPC-C style benchmarks; an early version runs up to 80 times faster than "a very popular RDBMS" which itself underwent several days of tuning by a "professional DBA".

H-Store's secret sauce is that it is (in effect) single threaded. It assumes that all transactions are very fast, and then executes each transaction in turn. This gets rid of the need for complex read-consistency models. Other optimisations include keeping undo in memory (because transactions are short and sharp) and discarding it at the end of the transaction.

Well, go off and read the paper for the details, but here's what I think.

On the negative side:

  • As a comparative benchmark, this fails through insufficient disclosure. For a paper that passes as academic, there is remarkably little detail on what they actually did.
  • Stonebraker assumes that "most" OLTP systems can be represented by a hierarchical model - what he calls a "constrained tree application" (CTA). As a result these applications are relatively easy to partition over a shared-nothing architecture. I wonder whether this is really the case. Parts of your application may be like that, or they may be like that for some periods (during the online day, for example). But even OLTP applications need to manage longer transactions, complex reporting, and updates to the "read only" tables. In his example, he assumes (section 5.0) that the Items table is read only, so it doesn't break his tree and it can easily be replicated. But we know that new items will be added; others will be re-priced, re-categorised, phased out. Can that be handled without interrupting a 24/7 H-Store style application?
  • He also seems to assume that there is only one axis of partitioning - in his case, the warehouse. But over time, the main focus of interest changes. An order is taken at a shop; it is ordered from a warehouse; it is delivered to the customer. Different "CTAs" at each stage. How does the H-Store morph its representation through the course of the information lifecycle?

On the positive side, though, this represents a call to action for the traditional vendors.

  • Any performance specialist knows that the best way to tune something is to stop doing it. If we really can change the rules of the game, we can avoid all that expensive "insurance". We're used to making calculated design tradeoffs for performance; this could be just another one of those.
  • I suspect that the RDBMS vendors will (more or less rapidly) steal any really good ideas. Stonebraker states that "no [RDBMS] system has had a complete redesign since its inception". But, like the proverbial axe, RDBMS internals have been refreshed, a piece at a time. Oracle's database kernel has had at least two major rehashes in its lifetime. It's well within Oracle's or IBM's capability to incorporate the more realistic ideas from this paper, and to find a way to blend them with the current state of the art.
  • They can also learn from the approach MySQL has taken of supporting multiple storage engines - horses for courses. Oracle already merges XML, relational and OLAP data stores; building in a high performance OLTP kernel to address specific classes of OLTP application is not at all inconceivable. Although it will lead to all sorts of information lifecycle difficulties, we are already used to migrating data from OLTP to OLAP; with good tool support it should be possible to work round the constraints that allow H-Store to dispense with so much that we normally take for granted.

I may revisit this paper to tease out other issues - for example Stonebraker rants against SQL (perhaps he's never got over Ingres being forced by the market to provide SQL rather than the more academically respectable Quel). H-Store uses C++, and may move to Ruby; the implication is that applications will be object-oriented, navigating row by row (like so many J2EE apps, and suspiciously like COBOL/Codasyl) rather than being set-oriented.

Let's watch this space.

Tuesday, February 26, 2008

SYS_CONTEXT versus V$ views for getting session information

A thread on Oracle-L today sidetracked into the use of V$ views for getting hold of session information such as username, SID and module. The poster had a logon trigger that was supposed to record these items for each logon from a particular client.

I don't approve of granting access to V$ views willy nilly; best practice is always to grant the minimum privileges necessary to achieve an objective.
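
For what it's worth, SYS_CONTEXT alone can capture everything the poster wanted, with no V$ grants at all. A minimal sketch, with an invented audit table and trigger name:

create table logon_audit (
  logon_time date,
  username   varchar2(30),
  audsid     number,
  module     varchar2(48)
);

create or replace trigger trg_logon_audit
after logon on database
begin
  insert into logon_audit (logon_time, username, audsid, module)
  values (sysdate,
          sys_context('userenv', 'session_user'),
          sys_context('userenv', 'sessionid'),  -- audit session id, not V$SESSION.SID
          sys_context('userenv', 'module'));
end;
/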

Another poster raised the issue of performance. In the past, SYS_CONTEXT was considered slower than direct access to the views.

So here is a test to compare the two:



set echo off feedback off
set timing on

set termout off

variable v_loops number;
exec :v_loops := 1000000;

set termout on

prompt Testing sys_context

declare
  l_user      varchar2(30);
  l_action    varchar2(32);
  l_module    varchar2(48);
  l_sid       number;
  l_loopcount pls_integer := :v_loops;
begin
  for i in 1..l_loopcount loop
    -- all three items, without touching any V$ views
    dbms_application_info.read_module(l_module, l_action);
    l_user := sys_context('userenv', 'session_user');
    l_sid  := sys_context('userenv', 'sessionid');
  end loop;
end;
/


prompt Testing mystat

declare
  l_user      varchar2(30);
  l_action    varchar2(32);
  l_module    varchar2(48);
  l_sid       number;
  l_loopcount pls_integer := :v_loops;
begin
  -- note this only gets one of the three pieces of information
  for i in 1..l_loopcount loop
    select sid
      into l_sid
      from v$mystat
     where rownum = 1;
  end loop;
end;
/



And here are the results:


C:\sql>sqlplus testuser/testuser

SQL*Plus: Release 10.2.0.1.0 - Production on Tue Feb 26 22:23:10 2008

Copyright (c) 1982, 2005, Oracle. All rights reserved.


Connected to:
Oracle Database 10g Express Edition Release 10.2.0.1.0 - Production

SQL> @sys_context_test
Testing sys_context
Elapsed: 00:00:07.31
Testing mystat
Elapsed: 00:00:38.57
SQL>


I suspect that in the past SYS_CONTEXT issued recursive SQL under the covers (just as the SYSDATE PL/SQL function used to, and as the USER function and the 11g assignment from a sequence still do).

Now I assume SYS_CONTEXT gets its information directly.
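
If anyone wants to verify that, tracing the SYS_CONTEXT test and running the trace file through tkprof should settle it - any recursive SQL would show up immediately. Something along these lines:

alter session set sql_trace = true;
-- re-run the sys_context block above
alter session set sql_trace = false;
-- then examine the resulting trace file with tkprof, looking for recursive statements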

Wednesday, February 20, 2008

The UML industry - as predicted

The Register has been poking fun at UML (and why wouldn't you). They have dug up a 1997 term paper piss-take which was probably pretty funny at the time; now it seems inspired, prescient even.

Monday, February 18, 2008

Shared Business Vocabulary

Following on from my active metadata post yesterday, Mike Ferguson at Dataflux says Shared Business Vocabulary (is) needed everywhere. By "shared business vocabulary" he means metadata, as this earlier post explains. I don't particularly like his phrase - metadata is fine by me - but maybe he's right and we need to use a less loaded (and misunderstood) term for it. And particularly in the warehousey world "metadata" also refers to the dynamics of data (when was it loaded, how long did it take) so there's scope for a better word or phrase.

Whatever that word or phrase is, please please can it not be ontology ....

Sunday, February 17, 2008

Active Metadata

Yesterday I moaned about poor use of metadata. Today I'd like to point to an example of the way metadata can be used actively. Andy Hayler describes Kalido's Business Information Modeler, which allows you to actively control and, when necessary, reshape the data warehouse by changing metadata; you can generate a BO universe, or deploy metadata to Cognos BI tools (more to come, apparently). You can (apparently) even undo the metadata changes, and see your warehouse as it would have been; kind of like Oracle's flashback recovery, but for data warehouse structures.

See Kalido's podcast, screenshots and other resources.

Saturday, February 16, 2008

2008 Data Management Predictions from Dataflux

Mike Ferguson posted these over a week ago. Key thoughts to take away from my point of view:
  • Information and data architects will continue to be in demand
  • Companies will need to invest again in data modeling tools and in data modeling skills
  • Holding this metadata in spreadsheets is no longer acceptable.

It is depressing to me that so many projects keep their business critical metadata in Word documents, Excel spreadsheets and Visio drawings. Some developers prefer to rely on their IDE - keeping DDL definitions as SQL in text files. That protects the definitions (at least you can find them quickly), but it doesn't get the maximum value out of them.

CASE tools have been around since the mid 80s or before - Oracle's SQL*CASE, now better known as Designer, was under development when I joined in '86 - so how come they are used less and less? They did get a bad name for encouraging complex, expensive and ultimately useless corporate data models - and we're glad to see the back of those, and the ivory towers they came from - but they can still be very helpful in defining and developing the metadata we need as a basis for system development.

I wonder whether the main problem is that many CASE tools are simply too expensive and/or too closed; they just can't cope with all the different kinds and layers of metadata we would like to throw at them, and they don't integrate well with all the other development tools around. Look at Designer - it's been more or less static for the last 10 years, and other Oracle products barely take any notice of it. No wonder it's slowly fading away.

Perhaps it's time for Oracle to get a grip and provide some common repository / metadata management for use across all its myriad of tools? Or for a small vendor or OSS project to take up the challenge? Let me know if you've already found the tool that can pull together ERDs, schema models, UML process diagrams, Discoverer EULs, a BO universe, Warehouse Builder or ODI transformation definitions and all the other kinds of development metadata that projects deal with every day.

Friday, February 15, 2008

Writing XML to a file

Marco Gralike has posted a couple of items at the Amis blog and a later followup on his own blog about ways of writing XML to a file - comparing DBMS_XMLDOM.WRITETOFILE with DBMS_XSLPROCESSOR.CLOB2FILE - and finding the latter faster.

Surely this is simply because the input to the test is an XMLTYPE (the result of his test query).
  • The call to CLOB2FILE requires an implicit conversion from XMLTYPE to CLOB, and then simply serialises the CLOB (a large string already in XML format)
  • WRITETOFILE is making an implicit conversion from XMLTYPE to DOMDOCUMENT - it is constructing a DOM tree and then walking it to produce the serialised output. Here's the signature for WRITETOFILE:

DBMS_XMLDOM.WRITETOFILE(
  doc      IN DOMDOCUMENT,
  fileName IN VARCHAR2,
  charset  IN VARCHAR2);
Building that tree is a major overhead - in this case - though obviously it wouldn't be if you actually needed to navigate around the tree adding/moving/updating or pruning nodes and branches.
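
To make the two routes concrete, here's a minimal sketch (XML_DIR is an assumed directory object, and the file names are invented):

declare
  l_xml xmltype := xmltype('<demo><row>hello</row></demo>');
begin
  -- CLOB route: serialise the existing XML text; no DOM tree is built
  dbms_xslprocessor.clob2file(l_xml.getclobval(), 'XML_DIR', 'via_clob.xml');
  -- DOM route: build a DOM tree from the XMLTYPE, then walk it to write the file
  dbms_xmldom.writetofile(dbms_xmldom.newdomdocument(l_xml), 'XML_DIR/via_dom.xml');
end;
/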

Update 16-Feb:
Marco has continued the story here, in great detail and at great length (tkprof listings and all). He's particularly interested in file size; I'd be more interested in the performance impact of the various steps in his operations. I will try to reproduce his tests and summarise those soon.

Nominative determinism

Nominative determinism: when your job or hobby matches your name. For example, the famous Major Major, from Catch 22. Or the British Antarctic Survey scientist I heard on Radio 4 yesterday explaining how he was going to drill down through 3km of ice - a Mr Core (or was it Bore - they both work, and I was in the car at the time).

The latest to catch my eye was in this quote at El Reg: "The vice president in charge of Red Hat's JBoss middleware Craig Muzilla, said Red Hat is going to undercut giants IBM, Oracle and Microsoft in the saturated and expensive middleware sector."

With a name like Muzilla, I guess working for an OSS company was pretty much inevitable.

DATAllegro, DW appliances and the Oracle lock-in effect

Andy Hayler has an interesting post on DATAllegro - talking about its sweet spot at the very high end of data volumes; apparently helped by a grid with up to 1TB/minute transfer speeds.

What caught my eye was Andy's observation that "Oracle customers can be a harder sell (than Teradata's, for example) not because of the technology, but because of the weight of stored procedures and triggers ... in Oracle's proprietary extension" (ie PL/SQL).

Just looking at the various technologies people talk about on OraNA blogs, I wonder whether the continuing growth in Java and .NET based applications over the last decade - where many of the business rules are executed outside the database - might hurt Oracle precisely by removing the PL/SQL "stickiness" Andy describes.

Admittedly Apex is a counter example - thoroughly PL/SQL, thoroughly Oracle; no chance of moving Apex clients to another RDBMS yet. And the tendency to provide management interfaces through PL/SQL (or very Oracle-specific SQL extensions) also helps to keep Oracle customers tied down. Which is good for us dyed-in-the-wool Oracle types. Whether the lock-in is good for the customers (or even Oracle) is another question; an existing customer's lock-in can feel like a lock-out to new customers.

Any thoughts out there?

Regards, Nigel

Tuesday, February 05, 2008

Schema Version Control

An outbreak of synchronicity has produced a volley of blogs and questions about database version control in the last few days.
I've seen so many projects take code control seriously but have no real idea how to control a database definition (let alone how to control changes to multiple instances of development, test and live data, both in house and on client sites).

Code control is a walk in the park compared to "schema control" (where the delta, such as an ALTER TABLE, needs to be developed alongside the new version of the database object creation script).

And "data control" adds even more challenges:
  • Can you upgrade the data in place, or do you need to create a replacement object?
  • Is there enough space?
  • Do you need to suspend user access? How long will it take?
  • What are the chances of reversing the upgrade if something goes wrong?
  • Will you even know if something goes wrong?
  • Who decides go forward, go back or muddle through?

A complex application version upgrade should be treated with as much respect as a data migration project. There should be a strategy (sure, it may be a standard strategy). Use standard tools / scripts. The more the process is repeatable - and the more it is repeated - the better. Make sure that all likely error conditions (missing log files, out of DB storage, non-writeable working directories, no listener) are tested for - test the tests!

By the way, the "process" includes all the operating procedures. Make sure that upgrades aren't nannied through by developers. Script the whole thing; have the production DBAs try it out the procedures as early as possible in test to make sure that they understand what's supposed to be happening (and how to know if it worked OK; the best thing is if the script just says:

IT ALL WORKED OK

or

SOMETHING WENT WRONG IN STEP 13 - SEE LOGFILE xxxyyy13

By the time the schema-and-data upgrade is executed for real, nothing - not even the unexpected - should be unexpected.
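
As a minimal sketch of the kind of wrapper I mean (the step number, file name and table are all invented for illustration), each step can be a SQL*Plus script that fails fast and leaves an unambiguous verdict:

-- upgrade_step13.sql: one step in a scripted upgrade
whenever sqlerror exit failure rollback  -- abort and roll back on the first error
spool step13.log

alter table orders add (priority varchar2(10));
update orders set priority = 'STANDARD' where priority is null;
commit;

prompt IT ALL WORKED OK
spool off
exit success

The calling script checks the exit code, and it - not a human - prints the SOMETHING WENT WRONG message pointing at step13.log.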

Monday, February 04, 2008

Patently ridiculous

SCO has had a pretty hostile press over the last few years as it morphed from a real software business into a patent mining outfit. Well, I've now come across another case of patent ambulance chasing over at Vincent McBurney's blog on ITToolbox. JuxtaComm and Teilhard versus IBM Microsoft Informatica SAP etc tells the story so far - basically that a small company is claiming it invented ETL, patented it in 1997, and can it now have loads-a-money if you please.

I followed Vince's link to Groklaw where InterSystems, one of the parties, is asking for help.

One of the parties is DataMirror (now part of IBM, and the purchaser of my former employer Constellar). As the SQL Group, Constellar had an ETL system that seems to fit the key claims as early as 1994 (to my personal knowledge - it may have been on the market in 1993).

I'm stunned that (apparently) Oracle has already caved, handing over $2 million. Some of the reasons I am so surprised:
  • I was actually working for Oracle UK in 1994 - the Constellar Hub (then known as Information Junction) was evaluated for one of my clients (who ended up purchasing ETI instead).
  • Oracle UK was involved in quite a few pre-sales to Telecoms, Pharma and others over the period 94-2000.
  • Oracle US consultants helped support the product in at least a few US clients.
  • Constellar's US office was actually sublet from Oracle Corp, and I know many efforts were made to sell (the product and/or the company) to Oracle in around 1997 timescale.
  • Oracle's own WarehouseBuilder must date from around then if not earlier.

Of course, Constellar wasn't alone at the time. We competed over time against ETI, Prism, Vmark/Ardent/Ascential, Informatica, and all the others (as well as the ubiquitous roll-your-own, which is still probably the market leader).

Slipping the litigants $2 million may be cheaper than arguing the case for Oracle - but for the rest of us (especially any cash strapped open source projects) it's not an option. And I'm sure a sharp lawyer at IBM might like a bit of ammunition to see them off. Hey IBM, you're innocent!

So if any ETL pioneer (or their heirs and successors) wants my testimony to fight off JuxtaComm's malicious and meretricious suit - just let me know! I'm available for testing out the transatlantic business class beds any time you like...

Update 13 March 2008: more from Vincent here.

Monday, January 21, 2008

RAC Stretch Clusters - a survey

There's been an interesting digression on the Oracle-L list today on the subject of RAC stretch clusters.

A stretch cluster is one in which nodes are separated - possibly by several miles - mainly as a precaution against a complete data-centre outage. I know of a couple of examples where separations in the region of 20-30 miles (30-50 km) are either in use or planned, and other posters on Oracle-L have mentioned "several" or "a handful" of implementations worldwide.

Obviously, the main performance issue for a stretch cluster is the latency and bandwidth of the interconnect. The bandwidth isn't affected by distance, but the latency certainly is: light in fibre covers roughly 5 microseconds per kilometre, so a 50 km separation adds around half a millisecond to every interconnect round trip.

I'd be very interested to hear from anyone who has implemented a stretch cluster (in test or production) with as much detail as you are free to pass on, particularly:
  • what is the distance between sites
  • how many nodes at each site
  • is the workload evenly distributed (active/active homogeneous), partitioned (active/active, heterogeneous) or uneven (active/passive)
  • some indication of database size and transaction rates (in whatever units are meaningful to you)
  • Any performance issues?

Either email me (nigel at preferisco dot com) or comment below. I would like to publish the results but let me know if any part(s) of your information is too sensitive to be broadcast, even anonymously.

Thanks in advance...

Friday, January 18, 2008

TiddlyWiki

I thought I'd seen all the permutations of Wikis out there: written in php, stored as files; written in PL/SQL and stored in Oracle; and with a range of different markup conventions. And recently I've successfully used wiki style markup to better decorate output from Oracle Designer (of which more later in a separate post).

Today, in one of those serendipitous chances that comes from following a couple of irrelevant but intriguing blog links, I tripped over TiddlyWiki. Rather than giving you a server-side wiki, the whole thing is packaged on the client side. A TiddlyWiki is a single HTML file that contains its own code, and which self-edits as you add content. You save the file locally (and can then publish to a website if you like).

As a tool written by web designers it just looks great. You don't just click and get another page; the expanded content (called a Tiddler) zooms out at you. You can choose to expand or collapse whichever tiddlers you like - at the same time. That may be a big advantage over more boring wiki implementations.

For large communities, it's probably all wrong - personal changes in effect fork the entire wiki; but it may be a very effective personal tool - kind of mind mapping on the cheap, but much prettier than editors like treepad (and instantly portable, no software install required - keep it on your USB stick).

Basic usage instructions can be found here - written as a TiddlyWiki so you can try it out. I'm going to see if I can turn a rather unexciting Designer entity report into a thing of beauty...