Wednesday, December 19, 2007

Latest Bloor market research for Data Migration

You can find this survey either at Informatica http://www.informatica.com/info/bloorwpq407/ or at Bloor http://www.bloor-research.com/research/survey/876/data_migration_survey.html.

Generally sensible throughout, but - unusually for Philip Howard's research output - it leaves some unanswered questions.

Even as a statistical forecasting effort, it does leave something to be desired. Rather a lot of tables have been included that don't contribute anything to the sum of human knowledge - in particular a single CAGR (compound annual growth rate) of 10% appears to have been applied willy-nilly to every single region and sector (confirmed by retrofitting the figures - allowing for rounding to the nearest million; a quick sketch of that retrofit appears after the list below). What a shame the research doesn't build on either Bloor or external figures for:
  • likely frequency of application replacement (by industry: Banking has a more rapid cycle than Trading Companies, and perhaps by context/application type)
  • industry/regional growth forecasts (eg from OECD)
  • No mention of government / public sector projects. This is the biggest area for cockups, especially given the fragmented nature of PFI/PPI in the UK, and similar huckster schemes abroad.
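
For what it's worth, retrofitting a flat CAGR is a one-liner; here's a sketch in SQL, using the survey's 10% rate and an assumed base of $5,000M (any base will do - the point is that every row of every regional table follows the same curve):

select level - 1 as years_out,
       round(5000 * power(1.10, level - 1)) as forecast_musd
from   dual
connect by level <= 6;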

However, the headline figures are worth absorbing, to give an order of magnitude for the market, and more importantly an idea of the extent of project failure.

  • 84% of data migration projects "fail" (not delivered on time and on budget)
  • Cost overruns average 30%; time overruns average 40%
  • Data migration is a $5 billion market this year (or maybe twice that if you throw in smaller projects that come in under the radar for this survey) growing to $12 billion by 2012 (sounds like a lot - but London is spending more than that on the 2012 Olympics...)

At a more detailed level, I'm not entirely comfortable with the attempt at separating "data migration" as a market in its own right from related markets (ETL, EAI, etc). I would prefer to see DM as a particular use case for ETL, EAI and DQ tools; many of the same tools will also have applications for data integration, application integration, business intelligence and other use cases - although mileage may vary.

Howard states that DM only applies to "one off" migrations and is distinct from data integration; well, yes and no. DM may have to take place over months or years; during that period, "source" and "target" systems will have to coexist somehow. That may be achieved through "temporary" data integration/replication, or by partitioning the data and gradually handing over slices from source to target. Given the ever-changing nature of business, it is by no means inconceivable that the process never reaches its originally planned conclusion (the final decommissioning of the source system). Parts of the "source" may turn out to be worth retaining (ie the economic case for replacement is not viable).

Perhaps the biggest concern I have is that there is no mention of DM as a subset of business change. DM scoping is often imposed by (sometimes poorly considered) "business" considerations and decisions. There needs to be a feedback loop from DM processes into the overall project feasibility, scoping, planning, and costing. How many DM budgets are over budget simply because the budget was unrealistic to start with - because project costing was done using some rule of thumb [eg see John Morris's data migration blog] that happened to be inadequate in the circumstances?

Finally, I am amused (and not at all surprised) to find that hand coding is the market leader at 30%, beating ETL into second place with 28%. Given that ETL tools are often given away with applications, databases, and maybe even with leading brands of Cornflakes, it is amazing that this situation hasn't much changed in the 15-20 years since ETL tools first appeared. I always thought that DIY (roll-your-own for some American speakers) was our (Constellar's) biggest competitor back then - and here's yet more anecdotal evidence. Informatica and other tool developers - not to mention application/product architects - need to understand why that should be:

  • no tool is a magic bullet: even a market leading ETL tool may make some cases easier, but can make a few (important) cases much harder to manage (if all you have is a hammer, everything looks like a nail)
  • tools are expensive to own: they can be expensive to buy, but much more importantly skilled tool users are expensive to train or hire. The migration skill-base is fragmented (what works for PowerCenter is wrong for Oracle Data Integrator; Ascential does things differently from Ab Initio).
  • data migration projects are seen as "boring": so may not always attract the youngest, thrustingest, enterpriseyest architects. Like support, a critical area is treated like a leper in some organisations. No wonder the outcomes are sometimes suboptimal. Those cast into the outer darkness of a migration project eventually either transfer internally (and let their skills degrade), or hop off into a consultancy or freelance contract to rent out their newfound skills. Far too rarely does a competence centre emerge.
  • application architects (for in-house and packaged apps) often forget the requirement for data migration into their whizzy new systems. Nearly twenty years after it was first released, Oracle Apps still doesn't have a fully supported bulk migration solution for even the most basic data (eg Payables 11.5.10 finally added vendors, sites and contacts but still doesn't support their bank accounts...). What's unglamorous for the end user is equally unglamorous for the product developer, it seems.

Ho ho ho! Happy Christmas everyone

Wednesday, December 05, 2007

I cite, you borrow, he steals...

Anyone who appreciates the hard work that goes into a decent presentation or book should read Doug Burns' tale of woe. Isaac Newton is widely quoted acknowledging his debt to other scientists: "If I have seen further it is by standing on the shoulders of giants". Don't let's tolerate those who seek to take the credit for another's efforts without due courtesy or recompense.

I find it particularly annoying as Doug (whose blog I read regularly, but whom I have never met in person) seems to be particularly (even pathologically?) careful to acknowledge where he gets his ideas from.

Take care to read the comments as well as the blog itself. This one will run and run...

Happy UKOUG by the way (from darkest Lancashire at the moment)!

Wednesday, November 07, 2007

Tombstone, AZ

Just back from a fortnight in Arizona, where we were visiting #1 daughter on exchange at UofA in Tucson. We took in Flagstaff, Williams, the Grand Canyon and Sedona as well as sights in and around Tucson itself.

A personal favourite was the Boothill Graveyard on the outskirts of Tombstone. It seems to have operated only for a very few years - and most of the occupants apparently died a sudden death - shot or hanged in the main (one is specifically shown as 'died of natural causes').

George Johnson was more unlucky than most. I wonder how they found out their mistake?

Wednesday, October 17, 2007

Got an issue with that?

Here's a little summary of issue management systems I found in the Register for use with (agile) software products. It struck a chord with me - not least because I am right now working with a project team of around 30 trying to manage and organise a range of issues on an ever changing series of spreadsheets, using email as the repository.

I could add the example issue tracking system that comes with Oracle Apex - has anyone tried that?

Other wikis with issue management templates / recipes are available - including PmWiki PITS.

Monday, October 15, 2007

Latest Gartner Magic Quadrant for Data Integration

Informatica has helpfully syndicated the latest data integration magic quadrant. Out goes Ab Initio - apparently due to their perennial problems with secrecy (ie they won't tell anyone anything). Only IBM and Informatica make it to pole position (IBM slightly bolstered by its recent purchase of DataMirror). Hummingbird is the back marker, both for vision and ability to execute. Anyway, read it yourselves; no major surprises in there (and why would there be in such a mature market...).

Saturday, October 13, 2007

Is there WebLogic in Oracle's bid for BEA?

Finally, Larry has got out his wallet and made a bid for BEA (which of course they have rejected). This one has been waiting to happen for several years. Oracle's experience with app servers hasn't been entirely happy - but after a slow start they have managed to steamroller their way to a decent market share (often on the back of Oracle Apps); meanwhile most of the (often less well funded) pioneers have fallen by the wayside. Who remembers Persistence? Gemstone? Whatever happened to Sun's variously named products?

So who's left now? Oracle, IBM, BEA; MS IIS for the Netties, and a handful of low/no cost options led by JBoss (aka Red Hat).

Buying BEA presents an interesting marketing challenge in the J2EE app server area - though at least Oracle only has one app server at a time. Things are much more exciting (ie confused) in the integration / add-ons market; Oracle Fusion is already in a state of upheaval, digesting bits and pieces from Siebel, Sunopsis, Hyperion and others. Throwing in BEA AquaLogic - which itself combines organic development and some BEA acquisitions, I seem to remember - will be "interesting" to say the least.

And don't forget other jewels at BEA. Tuxedo - a real piece of engineering - something Oracle could usefully combine with the best of its own server technology; and JRockit - a high performance JVM.

For those who hate acquisitions, remember: BEA bought in both JRockit (from a startup) and Tuxedo (originally developed at AT&T, bought in 1995)... and come to that, WebLogic itself (in 1998).

Thursday, October 11, 2007

Onwards and upwards

I've recently started a new contract which should prove very interesting. Starting off with a data migration study, I'm expecting there will be lots of performance planning and assurance to follow.

The setting is a rather large and unusual Oracle Apps implementation - client confidential of course. RAC, stretch cluster, massive batch processing, impossible response and resilience targets? Lots for me to (re)learn, and hopefully some useful previous experience to share.

The 250 mile weekly commute is a bit of a pain...

Bottoms up!

Tuesday, September 11, 2007

Cognos partners with Informatica

Cognos and Informatica have announced a strategic relationship; Cognos will sell and support Informatica Data Quality and Data Explorer products, and the two companies will "team to jointly provide customers with data integration capabilities" with a focus on performance management (rather than the wider data integration market?).

Interesting that Informatica's front page still highlights the May press release about being granted an injunction against Business Objects, while Cognos includes this release on its front page. A minor difference in marketing communications, or a subtle indication of who is the senior partner?

Analysis from The Street here, and from Intelligent Enterprise here.

Saturday, September 01, 2007

Finished at last (well, almost)


It's finally complete - we're just back from a lovely break in Sant'Angelo at the new house. Not entirely a holiday - there was lots of cleaning to do, and dozens of lights to buy and wire in (more to come later, along with a significant amount of Ikea furniture to be trucked in and assembled - in November we hope). And I spent most mornings working (those pesky deadlines again) proving at least that the EDGE service (no UMTS in the mountains) and a 20 euro for 500Mb web deal from TIM were enough to keep me going.

The garden remains - a couple of 80 year old olives should have gone in this week, and work will start on the rest in the autumn.
Nothing to do now but work as hard as possible to pay the bills!

Thursday, July 19, 2007

No more mirror, mirror jokes... IBM snaffles DataMirror

IBM bought my former employers DataMirror earlier this week. Obviously their main interests will be the HA product line (a direct value add for them), and then the Transformation Server (TS) real time data integration components. Others have written here and here, so I won't labour that except to say that as part of the package they have won Constellar Hub (formerly dear old Information Junction - there's a phrase that doesn't escape my lips too often).

DataMirror never really "got" Constellar. Sure, I think they sweated the customers enough to make a return on their CDN$10 million investment (what a steal that was!), but they never made a serious effort (afaik) to bring together the best of the Hub and TS. A shame, because I think it could have been a winner compared to the still rather tired 1990s EAI products like Informatica, DataStage, Ab Initio etc. In fact, TS+Hub could have turned out not unlike the Oracle Data Integrator (but with added "publish / subscribe", and strong IBM and Oracle partnerships).

Now the boot is on the other foot. Will IBM "get" DataMirror and its various product lines? All we can be reasonably sure of is that the Hub is now officially past its sell by date...

Am I glad I left DataMirror in 2001, and so didn't get bought again 6 years later? On balance, I think so. But I know some who have stayed all along - will they stay with IBM too?

Monday, July 09, 2007

Iona FUSEs Celtix and LogicBlaze

Iona says it is expanding customers' open SOA choices by bringing together under the FUSE banner products that were formerly part of Celtix ESB, or were acquired with the LogicBlaze purchase this April.

CBR points out that Iona seems to be favouring LogicBlaze supported components (Apache projects ActiveMQ, ServiceMix and Camel) over Celtix components (apart from CXF, which has moved over from ObjectWeb to Apache). The LogicBlaze purchase must be working out well, methinks...

Friday, July 06, 2007

Talend - another open source ETL tool

Bloor's Philip Howard writes at IT-Director about Talend Open Studio - unusual among ETL solutions in that it is a code generator (of Java, SQL, Perl) rather than an "engine" type of product.

As well as the usual drag-n-drop transformation GUI, this apparently supports business process modelling - which gives Talend a feature that many "real" ETL/EAI tools don't have. There's also support for using a server grid to parallelise processing, and there's an "on demand" SaaS offering.

Version 2.0 is now available, as well as a 2.1 release candidate which is said to add features including:

  • further performance optimizations
  • support for new databases (including bulk load)
  • transaction management (connection sharing, commit and rollback)
  • Slowly Changing Dimensions support
  • MOM (Message Oriented Middleware) support for real-time integration jobs
  • fuzzy logic data matching (using the Levenshtein and metaphone algorithms - see the aside after this list)
  • normalization, denormalization and flow merge
  • support for SSH remote connections
  • support for PGP file decryption (through the GPG binary)
  • reinforced support for the XML standard: DTD validation, XSD, XSLT transformation, significant improvements in hierarchical XML file generation, support of the XMLRPC Web Services protocol…
  • improvements to the tMap component, to support input filters and new join types (such as the cartesian product, first match, last match…)
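
As an aside for Oracle-siders: similar fuzzy matching is available inside the database. Here's a minimal sketch, assuming access to UTL_MATCH (a package that ships with recent releases, though not formally documented as far as I know) for the Levenshtein side, with good old SOUNDEX covering the phonetic angle:

select utl_match.edit_distance('Jellyfish', 'Smellyfish') as lev_distance,
       soundex('Talend') as phonetic_code
from   dual;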

Sounds like it might be worth a closer look...

Zoning in to Sonos

We've finally bitten the bullet and shelled out for a 2 zone Sonos music system. I see we're not alone; Rob Levy, BEA's CTO, says it is his favourite gadget - and Joel Spolsky has also endorsed it.

We started around Christmas buying 2 Squeezeboxes - because I'm cheap, and because the Sonos system needs at least one box on wired Ethernet - which we didn't have handy in the right rooms. Although the Squeezebox looks good, it didn't really deliver as I had hoped.

The worst things about the Squeezebox:

  • it's not very good at coping with poor wireless reception - which probably accounts for most of the other problems

  • although you can synchronise two boxes (say one in the kitchen, and one in the living room) they tend to drift apart; not a good sound!

  • The slimserver software is incredibly slow at (re)indexing

  • the whole solution relies on slimserver software installed (by an importer) on a linux NAS fileserver - that makes getting updates much harder.

  • Internet radio was listed, but I could never get it to work

  • Documentation and help was rather flaky

  • The boxes look great but the UI is crummy - I never managed to explain it to my wife (or even to myself...)




So we cut our losses and invested in two Sonos ZP80s - they're quite small boxes, which feed into your existing amplifiers (you can use ZP100 if you want a built in amp). Rather than recable our house, we got a couple of powernet adapters. Music sits on a (simple) NAS disc - no intelligence required there.

We had some difficulty getting the powernet working - in the end, after three hours trying every feasible combination of sockets in our (1st floor) office and (ground floor) living room, the very helpful installer swapped the original Netgear products for Microlink and found a pair of sockets that worked. Fantastic! After that everything was very straightforward. There's a (pricey) iPod-like controller - or you can use a desktop interface. Once you've sussed the Zones and Music buttons, it's a piece of cake to navigate.

It's using the same music library as Windows Media Player - so I can load music once, and use it either on my headphones from the laptop, or on the Sonos system. Having the same music in two zones is easy - and very reliable. Adding new zones is also very simple (I just know we'll need a third zone in the office...). And internet radio works a treat - though with a slightly robotic quality, from the selection of sites supported (plenty of UK ones - helpful for us as we get almost nothing on DAB or FM).

The Squeezeboxes won't be entirely wasted; they'll go out to Italy this summer, where they should benefit - eventually - from being wired in (assuming the builders have remembered the CAT-5...).

Sunday, July 01, 2007

Graphic example

Thanks to Oracle AppsLab for referencing this stunning presentation by Hans Rosling. Not just for the superbly informative animated graphics, but also for the moving and thoughtful content (and the sting in the tail - watch all 19 minutes for the climax).

Friday, June 29, 2007

Could open source BI close out incumbents?

Bloor's Philip Howard writes in The Register that Jaspersoft 2.0 is now a serious threat to mainstream BI (and ETL) vendors; a full enterprise license for $35k will knock spots off most of the competition.

Perhaps that helps to explain why the fizz has gone out of Cognos's recent results.

And it doesn't help that according to a survey sponsored by Sybase and reported by CBR, only 13% of UK BI projects work as expected. So naturally "31% of firms surveyed, particularly small and medium sized businesses, [are] keeping an especially close eye on keeping costs down to a manageable level."

Thursday, June 28, 2007

To stakeholder (verb, transitive)

We've all done it at some point in our scribblings; we use horrible new words just because we know that you know what we really mean.

One particularly fashionable word at the moment is stakeholder. It's a useful short-hand for a person or group that has an investment, share, or interest in something such as a business or industry - and which can include employees, customers, the local community - and (if necessary) Uncle Tom Cobbley and all.

Here's a new usage I found in a consultant's report recently:

    Customers should ideally not be adversely affected ... and if downstream impacts are identified the customer should be stakeholdered ahead of time and those impacts managed accordingly.


From the context, I think the author means that someone should explain what's happening to the customer, and manage his (understandable) wrath. But rather than being good for the customer, being stakeholdered sounds more like getting out the garlic and the silver stake. Hang your head in shame, anonymous author!

Monday, June 25, 2007

IBM loses 80% of its Informix customer base

Computer Business Review reports that IBM admits to losing 4 out of every 5 customers for the Informix database it bought back in 2001. That's 80,000 gone out of 100,000.

Where did they go - to DB2 (in which case IBM is happy, even if the Informix rump isn't), or to Oracle, MS or the open source competition?

There's a contrary opinion here from ComputerWorld. Informix may have been "IBM's dirty little secret" but CW believes that the recent "Cheetah" release, improved marketing and changes to senior management in IBM will add up to a better future. Oh, and those 20,000 customers turn into 20,000 members of the Informix User Group - so there may be more out there that CBR didn't count.

Monday, June 18, 2007

IBM acquires Telelogic; a Rational move?

Interesting. IBM just announced the acquisition of Telelogic for around $745M. Telelogic is a Swedish company specialising in software configuration management; obviously their products will come under the Rational brand within IBM.

I spent much of the last year as a Telelogic CM/Synergy user - having previously been more familiar with IBM Rational's own ClearCase product. They each have their strengths and weaknesses; but both suffer most from the naivety of their users and administrators. I've now worked on several sites that have proudly handed over some seriously enterprisey sum of money to the vendor, and then used one or other of these products when CVS or Subversion would have worked just as well (or maybe better). Why buy an all-singing, all-dancing tool, and then tie its hands behind its back?

Anyway, read what the Register says here and here.

Friday, June 15, 2007

The new house is coming on nicely

Here's a view of our new venture Casa dei due Mori - the house of the two mulberries - under construction. We're certainly looking forward to spending lots of time there this summer when it should be finished. It's been a long trek to get this far - nearly four years from the thought to the finished article. I think it will be well worth the wait.

Ciao!


Tuesday, June 12, 2007

Buffered LOB_APPEND

Jens Ulrik asked to see the code for a buffered LOB_APPEND mentioned in an earlier post LOB_APPEND eats resources....

Here it is as part of a package. Apologies for the rush job at the time; inelegant, but it was worth it...



gi_chunk_length integer := 0;
gv_chunk        varchar2(32767);

-- add a character value to the CLOB being built up; catch null values
-- NOTE - final flush is done elsewhere
-- and we assume max(length(av_value)) < ki_maxchunksize

PROCEDURE lob_append (at_doc_clob IN OUT NOCOPY CLOB, av_value IN VARCHAR2)
IS
  li_value_length     integer;
  li_new_chunk_length integer;
  ki_maxchunksize     constant integer := 32000;
BEGIN
  -- prt_common_pkg.print_xml_string(av_value);
  IF av_value IS NOT NULL THEN
    li_value_length     := length(av_value);
    li_new_chunk_length := gi_chunk_length + li_value_length;

    IF li_new_chunk_length >= ki_maxchunksize THEN
      -- pad out the chunk as much as possible
      gv_chunk := gv_chunk || substr(av_value, 1, ki_maxchunksize - gi_chunk_length);
      -- write out
      DBMS_LOB.append (at_doc_clob, gv_chunk);
      -- initialise remainder into next chunk
      gi_chunk_length := li_new_chunk_length - ki_maxchunksize;
      gv_chunk        := substr(av_value, li_value_length + 1 - gi_chunk_length);
    ELSE
      gv_chunk        := gv_chunk || av_value;
      gi_chunk_length := li_new_chunk_length;
    END IF;
  ELSE
    prt_common_pkg.LOG ('lob_append null ignored',
                        prt_common_pkg.klog_ret);
  END IF;
END;

The final flush of the last chunk of a LOB is called like this:


IF gi_chunk_length > 0 THEN
  -- flush out the last chunk of the CLOB
  dbms_lob.append(at_doc_clob, gv_chunk);
  gi_chunk_length := 0;
  gv_chunk        := null;
END IF;
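
To give a flavour of how the pieces hang together, here's a hypothetical caller - just a sketch; the package name prt_doc_pkg and the flush_doc wrapper are made up, with flush_doc assumed to contain the final-flush fragment above:

declare
  l_doc clob;
begin
  dbms_lob.createtemporary(l_doc, true);
  -- thousands of small appends, but only a tiny fraction of the calls
  -- results in a real DBMS_LOB.append
  for i in 1 .. 100000 loop
    prt_doc_pkg.lob_append(l_doc, '<row id="' || i || '"/>');
  end loop;
  prt_doc_pkg.flush_doc(l_doc);  -- hypothetical wrapper around the final flush
  dbms_output.put_line('length: ' || dbms_lob.getlength(l_doc));
end;
/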

ETI offers data integration projects as a built-to-order service

(Disclosure: I worked for Constellar, a competitor, from 1995 to 2001, and worked with - or perhaps I should say against - ETI*Extract at an Oracle customer site as long ago as 1994).

Bloor's Philip Howard reports that ETI is offering a 'built to order' service for data migration / integration, based on ETI Solution (formerly ETI*Extract). Just enter your requirements into their integration portal, or send them a Word document, and they claim to be able to deliver a generated solution at an average price of $25,000 - with a guaranteed delivery time (usually just 2-4 weeks). For more info, see ETI Built To Order Integration.

In my opinion, ETI's products suffered poor sales in the past from being pricey and (compared to products like Informatica) difficult to use. The template-based generation (by no means unique to ETI - Constellar took a similar approach, as does Oracle Data Integrator) was too hard for most clients to customise. However, the fact that it is a pure generator (there's no runtime engine) helped ETI establish a niche supporting all kinds of legacy/mainframe data sources and targets, and now makes it possible for them to deliver generated code without leaving their IP behind.

Offering BTO seems like a good way to exploit the tool's underlying capabilities, and the skills ETI's own consultants have accumulated, without having to make the tool itself more (dare I say) user friendly.

One question - most integration projects don't know what their requirements really are until they start development and testing. How will this service deal with iterative development? What about late 'clarifications' - "oh, did I mention that the NOTES field got used to distinguish customer types during the early 80s; in the 90s it was used to hold foreign currency information for certain services..."?

So now let's see how this works in real life!

Saturday, June 09, 2007

Open Source BI and ETL - picking up pace?

A couple of recent stories to ponder on: SnapLogic raises $2.5M (admittedly from its founder's very own VC boutique) to build its LAMP based data services, while according to CBR, Ingres is preparing a Jaspersoft based BI/ETL appliance.

Meanwhile Pentaho has released its Data Integration 2.5 (formerly known as Kettle) and is showing at ODTUG what it calls "the world's most popular open source BI suite".

Thursday, June 07, 2007

How to build a calculator - not!

The wonderful Worse Than Failure (formerly The Daily WTF) has been running The Olympiad of Misguided Geeks at Worse Than Failure Programming Contest. It's an inspired idea: just how bad can you make a deliverable, while at the same time meeting the requirement?

Alex set contestants the task of building a calculator. Responses have included:

  • Driving it all through optical character recognition - my personal favourite so far

  • Collecting inputs and simply sending them to CALC.EXE to do the work

  • One built on an extensible framework that allows you to add new digits. You want to add apples and oranges? Well, now you can!


All too good to miss - you can find them here, so enjoy!

Sunday, May 20, 2007

Open source BI vendors get busy

A useful article from Computer Business Review. Has anyone got any experience with Pentaho or Jaspersoft data integration tools being used in anger that they'd like to share?

Saturday, May 19, 2007

Celona Evolve - "progressive integration" competitor for Oracle Data Integrator?

A piece of work I've been doing has involved taking a cursory look at Celona Evolve, though sadly I haven't been in a position actually to give it a go.

Coincidentally, this week Bloor's Philip Howard has written about it at IT-Director.

For the Oracle-ite, it's positioned in a similar market segment to Oracle Data Integrator (formerly Sunopsis); providing traditional ETL but also talking up continuous (or "progressive") integration (EAI / SOA / EII - take your pick of acronyms).

So far, successful references for Celona are hard to find - always a problem for startups. And it's a UK company; my personal experience says that tends to be a disadvantage too. You may get a few good early adopters over here, but going to the States can be a killer. At present the product seems to rely heavily on Celona's accompanying services; they won't be able to grow properly until either the product is easier to use, or there's a large independent base of Celona skills in the market (as there is for other tools like Informatica, OWB, etc).

They're obviously gearing up for a more public stance; they "launched" on 8th May (hence the analyst briefing). Good luck - we'll see how things progress.

Tuesday, May 08, 2007

Is big software now like big pharma?

An interesting summary at Barrons of a VC panel at the Software 2007 conference. The worry is that big companies like MSFT, ORCL and IBM have so much CIO mindshare that there's none left for the little guys; the rich get richer and the poor (or small) are shut out.

Even the SaaS players trailing behind Salesforce are tiny; as Ted Schlein says “They are all gnats. They are all tiny. Not one would be a small division in one of these larger software companies.” (Actually, even Salesforce's (Q407) revenue is just $144M; our Larry could buy it from petty cash...).

Moving swiftly on

I've reached the end of a nine month engagement, where I've been helping to maintain a 12 year old system; originally developed, I suspect, on Oracle7, and still running on Forms/Reports 6i. It's been interesting mainly as an example of how to start to rebuild understanding of a system where the team (mainly contractors) has been churned, the original documentation is hopelessly outdated, and the institution's understanding of its own systems has been undermined.

We started to replace a LAN full of (flakey, unindexed, heavyweight) documents with a wiki-based repository of (current, hyperlinked, searchable, lightweight) 'bits of knowledge' that can be combined in various useful ways. One or two other groups at the client had already started to use PmWiki, so we built on that. When I get a chance I'll post specifically on how I found PmWiki as a tool, and how it compares with other wikis (TWiki and MoinMoin) that I've used in the past.

Now I've started with a new client, where I am advancing my career by working on a data migration study for a 20+ year old system, based on a pre-relational database, whose name of course I can't mention here. The good news is that the 'new technology stack' is firmly Oracle based; but the most interesting part will be dealing with the political and operational issues involved in migrating what are literally the core systems for this business.

On the way, there should be some interesting comparisons of EAI/ETL/EII tools; I hope we see Oracle Data Integrator put through its paces on the way. Certainly I should be able to post more on 'integration' generally for the next few months.

Wednesday, May 02, 2007

Sonic's Dave Chappell joins Oracle

Oracle must be well pleased to pick up Dave Chappell, formerly a VP and chief evangelist at Progress Software's Sonic division (he's still on the management team there, according to this link). When that gets taken down, you can find his O'Reilly author profile here.

Dave will be development VP for SOA/ESB. I wonder which products that includes - Oracle Data Integrator perhaps? Good luck Dave, enjoy the ride at Oracle - I hope we meet again at another JavaOne some day.

PS, don't confuse him with any other Dave Chappell (like this one who runs an eponymous IT consultancy and speaks (for example) on BPEL, or this one who (I think it's safe to say) has no IT connection at all).

Java Caching for Oracle Apps 11i

Steve Chan's article about the use of caching is interesting; now how long will it take for Oracle to put him together with the guys from Tangosol that they acquired in March?

Who knows, they could even finish off JSR 107: JCACHE - Java Temporary Caching API, which Oracle submitted in 2001, and for which Tangosol's Cameron Purdy has been tech lead since sometime around 2002/3. Heck, most of the companies listed in the expert group have gone to the retirement home long since (names like Gemstone, Bluestone, iPlanet and SeeBeyond). Not bad for a project that was expected to take 12 weeks...

Toodle pip!

Thursday, April 26, 2007

Joins over histograms

Here's an awesome paper from Alberto Dell'Era and Wolfgang Breitling looking at "the formula used by the Cost Based Optimizer to estimate the cardinality of a [single-column] equijoin, when both the columns referenced in the join predicate have histograms collected."

Awesome because of the thought given to the presentation, and the care taken to test the results. Alberto has followed the lead taken by Jonathan Lewis's CBO book to carefully deconstruct and reverse engineer the algorithms used - with one or two surprising results.

Well worth a read...

Friday, April 13, 2007

Iona picks up LogicBlaze and its open source SOA platform

The team at LogicBlaze has been putting together open source systems including Apache ActiveMQ (a JMS implementation), Apache ServiceMix (an ESB) and LogicBlaze FUSE (their own SOA platform).

Iona and LogicBlaze have been working together for a while, so there shouldn't be too many integration problems.

I wish the team, especially James Strachan and Robert Davies, former founders of (and my colleagues at) SpiritSoft, good luck in their new home. It's interesting that Iona - which contributed several people to the SpiritSoft team around the turn of the millennium - should be their next resting place.

Friday, April 06, 2007

WebMethods sells itself to Software AG

I said a couple of months ago that webMethods needed a shake-up. Well, now they've sold out to Software AG - originally best known in the 1980s for its Adabas database and the Natural 4GL, more recently a player in the XML / SOA world with Tamino and EntireX.

WebMethods was founded in 1996 and its IPO in 2000 marked a high point of the EAI boom; an offer price of $35.00 turned into an opening of $195 and a first day close of $212. The price reached over $300 quite quickly, but since then has trended fairly steadily down to today's sub-$10. Not a good investment overall!

Software AG wants this to be "a major step in [their] plans to more than double revenue to EUR 1 billion (USD $1.3 billion)." WebMethods' $200 odd million of annual revenue is a big - if barely profitable - step in that direction.

An article at Barrons notes that the share price is now trading above the offer - a sign that some in the market are expecting a higher bid (the analyst doesn't think they will be in luck).

The deal has pushed up Tibco's price too - on hopes that they will be the next target, perhaps.

Saturday, March 24, 2007

Cache in the bank - Oracle buys Tangosol

Oracle has picked up Tangosol - and its caching product Coherence - for the traditional undisclosed sum.

Well done Cameron Purdy. For a couple of years I worked at SpiritSoft, and we competed with Tangosol. Our engineers always said we had the better architecture - but Tangosol whipped our ass good anyhow. Cameron's team delivered a robust product and made it easy for people to buy - and the fact that they now claim 1500 implementations is testimony to lots of hard work all round.

We competed around the JCache JSR 107, which must be the slowest JSR to get to a review draft (and ironically, it was originally sponsored by Oracle). This 2002 TSS thread was already talking about excessive delays; since then very little seems to have happened - mind you the JCP (Java Community Process) site is having an off day, so perhaps it came out while I wasn't looking.

Soon after that TSS exchange Cameron joined the JSR 107 expert group, and IIRC he became the spec lead. But still no delivery. Perhaps now Oracle will push the standard some more? Anyway, all that competition is water long under the bridge, so let's recognise a winner, and as Cameron would say:

Peace

Thursday, March 22, 2007

John Backus - father of FORTRAN - RIP

As well as being the father of the first high level language suitable for numerical work - and of Algol 60 which can be taken as the root of most of today's widely used languages like C, C++ and Java - John Backus lives on as coauthor of Backus Naur Form (BNF) which we still use in the Oracle docs - for example:

relational_table ::= CREATE [GLOBAL TEMPORARY] TABLE [schema.]table
                       [(relational_properties)]
                       [ON COMMIT {DELETE | PRESERVE} ROWS]
                       physical_properties
                       table_properties;

Monday, March 12, 2007

Vitria is now private; Iona buys C24; Informatica Integration on demand

As I noted in October, Vitria is being taken private by its founders. Shareholders have now approved, and the transaction closed on 7th March.

This marks the end of a rollercoaster ride on NASDAQ (and by rollercoaster I mean there was a sharp climb at the beginning, but after a series of humps, bumps and loops you end up right back on the ground). Vitria's results over the last three years show everything gradually declining - revenue, assets, license sales. One positive - losses have also been reduced. Can Chang and Skeen turn the ship around, or is this just another step towards the sunset retirement home for distressed software companies?

Meanwhile Iona has picked up (London) City specialist integration boutique C24 for an undisclosed price. Given C24's small size - 12 employees - there should be no major digestion problem for Iona as long as C24's customers are kept sweet. Good luck to the C24 guys, a couple of whom I met while I was at SpiritSoft.

Finally, Informatica has launched the "first and only" on-demand data integration service. The "On Demand Data Replicator" - yours free for 30 days, and $1500/month from then on - is initially aimed at Salesforce.com customers; SaaS vendors like RightNow and NetSuite are next in line. The integration is (I guess) intended to be from your internal apps to your hosted apps, and vice versa.

Thursday, March 01, 2007

Oracle captures Hyperion - a well planned thrust at SAP?

Lots of buzz today about Oracle's rumoured - then confirmed - acquisition of Hyperion (formerly known as, and still largely known for, the eponymous Essbase multi-dimensional database). Oracle bloggers like Mark Rittman have concentrated on the BI side of things, but it is worth remembering that Hyperion has also assembled a set of financial applications (mainly planning and modelling, as you'd expect) - not to mention a BPM product line. That could be very interesting as an add-on to Oracle Apps (not to mention PeopleSoft and Siebel).

Looking at their latest Q2 results, there’s no breakdown of revenue between product lines - but it’s interesting that the headcount is 1745 in Americas, 632 in EMEA and only 212 in APAC. Oracle’s wider/deeper international network could give a big boost to sales in Asia (as well as making admin savings at home). They'll keep the salesforce but dump the top-heavy administration.

The financial market seems reasonably positive about the news (see for example Barron's Eric Savitz). As well as providing a sell-up for Oracle Apps, Peoplesoft and Siebel, this can also get Oracle's foot further into the door at SAP sites.

    "Hyperion is the latest move in our strategy to expand Oracle’s offerings to SAP customers,” said Oracle President Charles Phillips. "... Now Oracle's Hyperion software will be the lens through which SAP's most important customers view and analyze their underlying SAP ERP data."

There is also a feeling that Oracle got a good price, catching Hyperion with its share price down. The impact on other BI players like Business Objects and Cognos is mixed: M&A activity might be expected to push up their price - but their share prices already factored in a bit of a punt on Larry Ellison coming round to tea; now he's spent his money on Hyperion, the others may well fall back.

Good luck to the Oracle BPM, BI and Apps product managers trying to make sense of it all!

PS: Another good take here from Curt Monash at DBMS2

Wednesday, February 28, 2007

WebMethods/Infravio X-Registry - free 45 day trial download available

Back in December, I asked would you pay $99000 for a starter pack.

Well, it looks like everyone else must have agreed, and not paid up. So webMethods has now announced a free trial download which "can be deployed for 45 days of testing and evaluation within a non-production environment".

After their recent results (see yesterday's post) I guess someone in marketing has decided it's about time to shake things up and create a few more leads for the salesforce to work on.

Tuesday, February 27, 2007

EAI results - a mixed bag

Back to looking at financials, a note on Motley Fool that touts Tibco as a "king of cash" reminded me to poke around some of the EAI vendors' financial results.

Tibco's Q4 (ending November 06) was pretty spectacular, with revenue up 20% to $160 million - license revenue growth being a very healthy 32% to $88 million. But earnings over the entire year were barely up; Tibco seems to have a pattern of a huge Q4 after flat Q1/2/3.

WebMethods earnings show license income down more than 10% in Q3 to $19.7 million (total revenues $53.1 million). Motley Fool's article New Product, Same Problem suggests that digesting recent acquisitions (eg Infravio) and restructuring the salesforce are affecting sales.

Vitria's results show a startling spike in license revenue, up to $6.7 million from $1.8 million in the same quarter to Dec 2005 - around half of that accounted for by two customers. Encouragingly, Vitria is in the black.

Finally, BEA takes a bit of a beating - shares down 10% even though revenues were up 15% on same quarter last year. Why? because forecasts are down for next quarter (the analysts wanted $385 million, but management expects only $350-364 million).

A Morningstar analyst rounds it all off in the same news item by suggesting that BEA will get tough competition in the SOA space from Oracle, IBM and Tibco.

Monday, February 26, 2007

Unique IDs for multi-master replication - Sequence or SYS_GUID?

Oracle-L is proving to be a good source of inspiration at the moment. Oracle ACE Syed Jaffar Hussain asked the question Is it possible to share a database sequence between multiple databases?

Using remote sequences
A couple of the replies took the question a bit too literally, and said yes, you can define a sequence on one database, and use it from another:

Select SequenceOwner.RemoteSeqName.NextVal@DBLinkName From dual


This works - and it can be made 'location transparent' by creating a local synonym for the remote sequence. But it is asymmetric (one database has to own the sequence) and it introduces a point of failure. If database 1 owns the sequence, database 2 can only insert rows if database 1 is actually available. If you could guarantee availability, why would you bother with replication?
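
For example, hiding the link behind a synonym (a sketch; all the names here are made up):

create synonym myseq for SequenceOwner.RemoteSeqName@DBLinkName;

-- callers no longer need to know where the sequence lives:
select myseq.nextval from dual;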

Using a sequence local to each database
Several replies (including mine) suggested using carefully defined sequences which will deliver discrete values on two or more master databases. There are two basic patterns:

Partitioned: Suggested by several posters, the number space is divided up in blocks:

on db 1: create sequence myseq start with 1000000 maxvalue 1999999
on db 2: create sequence myseq start with 2000000 maxvalue 2999999


A variant on this is to use positive numbers for database 1, and negative numbers for database 2.

Interleaved: The more popular option is to use the old trick of assigning odd numbers to database 1, and even numbers to database 2:

on db 1: create sequence myseq start with 1 increment by 2
on db 2: create sequence myseq start with 2 increment by 2

This mechanism is much easier to manage (essentially, it doesn't need any further management). It is also easy to extend to 3, 4, 27 or 127 masters - just set the "start with" to the database number, and "increment by" to the maximum anticipated number of databases required.
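
For example, leaving headroom for up to ten masters (a sketch):

on db 1:  create sequence myseq start with 1  increment by 10
on db 2:  create sequence myseq start with 2  increment by 10
...
on db 10: create sequence myseq start with 10 increment by 10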

A third option was also proposed by Mark D Powell: use SYS_GUID(). That has some disadvantages:
  • SYS_GUID() is bigger (16 bytes on 9iR2) than a NUMBER (eg a 13 digit integer needs around 8 bytes). Obviously the extra space follows through to indexes, foreign keys etc.

  • Another is simply that it is a RAW, which has some possibly undesirable implications; for example several common tools (including SQL*Developer 1.0) can't directly display RAW values; you have to explicitly select RAWTOHEX(id_column).



On the other hand, SYS_GUID can be defined as the default value for a column - unlike NEXTVAL which is normally set in a trigger or directly as part of an insert into/select from statement. Worse, it is very common to see a separate SELECT seq.NEXTVAL FROM DUAL for every ID generated. I've never investigated the relative performance of SYS_GUID() against getting a sequence number - anyone else like to share that? That may well be the most important consideration of all.
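
For illustration, here are the two styles side by side - just a sketch, with invented table and sequence names:

-- GUID key: the default does all the work, no trigger needed
create table t_guid (
  id   raw(16) default sys_guid() primary key,
  data varchar2(100)
);

-- sequence key: typically populated from a trigger
create table t_seq (
  id   number primary key,
  data varchar2(100)
);
create sequence t_seq_id;

create or replace trigger t_seq_bi
  before insert on t_seq
  for each row
begin
  select t_seq_id.nextval into :new.id from dual;
end;
/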

So I think I'll be coming back to this subject in future.

Update - later the same day:

I've quickly timed a million iterations of select sys_guid() from dual, and a million iterations of select sequence.nextval from dual (Oracle XE 10.2.0.1, HP dv1665 Centrino Duo running Windows XP)
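
For the record, the harness was nothing fancier than this kind of thing (a sketch, not the exact code I ran; it assumes a sequence called seq already exists):

declare
  l_guid  raw(16);
  l_id    number;
  l_start pls_integer;
begin
  l_start := dbms_utility.get_time;
  for i in 1 .. 1000000 loop
    select sys_guid() into l_guid from dual;
  end loop;
  dbms_output.put_line('sys_guid: ' || (dbms_utility.get_time - l_start) / 100 || 's');

  l_start := dbms_utility.get_time;
  for i in 1 .. 1000000 loop
    select seq.nextval into l_id from dual;
  end loop;
  dbms_output.put_line('nextval : ' || (dbms_utility.get_time - l_start) / 100 || 's');
end;
/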

select sys_guid() into variable from dual: 89 seconds
select seq.nextval into variable from dual: 40 seconds

For good measure, I've added a variant - one million calls of sys_guid() without a select:

variable := sys_guid() : 95 seconds

One of the potential advantages of sys_guid() I had anticipated is that it could be called directly from PL/SQL - but it looks like the implementation is less efficient than it might be; tracing shows that the PL/SQL function recursively selects SYS_GUID() from dual.

So I think I'll stick to traditional and more convenient sequence.nextval for now.

Chin chin!

Sunday, February 25, 2007

Avoiding application suicide by session pool

A recent post on Oracle-L mentioned:
    We had an issue affecting one of our production DBs. The middle tier application for some reason, went crazy spawning processes chewing up the DBs process parameter. DB started throwing errors indicating max process exceeded.


He went on to ask how to wake up PMON to clean up these processes - he'd had to bounce the database to get things cleaned up.

My response was to ask about the root cause, rather than the symptom. The poster may have been solving the wrong problem (or rather, after putting out the fire, he needed to find the cause and stop it happening again).

"Mid tier went crazy spawning processes" is often a symptom of session pool
madness. In such an application, X number of users share a smaller Y number of Oracle sessions. Everything tootles along happily; users (midtier threads if you like) loop around:

  • get a session from the pool

  • issue one or two SQL

  • commit/rollback

  • give the session back


As long as the users spend less time in Oracle than they do in the rest of the
application (and waiting for user input etc), no problem.

Then something goes wrong; maybe a session sits on a lock that everyone needs; maybe a sequence cache isn't big enough (or is ordered) and/or you forgot that We Don't Use RAC; maybe you had an SGA problem like ORA-4031.

What happens next:

  • all the Oracle sessions in the pool are busy

  • next midtier thread asks for an Oracle session

  • midtier pool manager says "no problem", launches a new Oracle session and adds it to the pool

  • that session becomes busy

  • and the next thread, and the next thread, and the next thread...


Soon instead of sharing say 100 Oracle sessions across 1000 processing threads,
your mid tier has responded to the blockage by adding 900 new sessions to the
load. That's probably made the problem worse, not better - kind of like slamming your foot on the accelerator when you see brakelights ahead in the fog.

I had exactly this problem last year, performance testing a J2EE app using OC4J. We hit a 4031 problem (no bind variables in one part of the system) and then almost immediately the application server did its lemming impersonation as described above.

Things to consider:
1) reduce the upper limit on the session pool size (definitely to below your Oracle processes level!)
2) if possible, slow down the rate of session starts (eg set a delay in the mid-tier session manager)
3) find out what caused the problem in the first place.

The good news is that if you dampen down this suicidal behaviour, you probably have a better chance of diagnosing the root cause next time.
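
On point (1), it also helps to know how much headroom you normally have; here's a minimal sketch for keeping an eye on it (needs select access to the v$ views):

-- the hard ceiling the instance will enforce
select value as max_processes from v$parameter where name = 'processes';

-- how many sessions exist right now, and what state they're in
select status, count(*) from v$session group by status;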

Wednesday, February 21, 2007

Data warehouse appliance pros and cons

Andy Hayler writes here about the marketing hype surrounding data warehouse appliances like Netezza, and warns that buyers should not just consider the 20% of a project budget that is typically spent on hardware and software. The other 80% goes on - well, people like us.

However the economic case for the Netezza model can still be made - rather easily. I'm aware of a POC where the capital costs of a 50TB Netezza solution are considerably less than what the business is being charged back each year just for the CPUs and discs supporting the equivalent Oracle DW. The clincher is that Netezza doesn't require the same level of DBA expertise - it just hasn't got any tuning knobs for us to fiddle with. Oh, and critical queries run up to 150 times faster ...

If these appliances result in a hefty capital saving and a substantial reduction in DBA/performance tuning overheads, then they will certainly continue to gain market share.

Wednesday, February 14, 2007

Google Oracle - Good start, now what about Metalink?

The news is out - here from Eye on Oracle - that Oracle has finally opened the door to Google indexing of Oracle's documentation.

That's marvellous, but it's not exactly the fall of the Berlin wall. Many Oracle docs have been online and available for Google search for years - certainly for the database. Some of these have been (accidentally?) made available on customer websites (eg http://www.lc.leidenuniv.nl/awcourse/oracle/nav/docindex.htm - it's often universities) and others by Oracle themselves (eg http://download-east.oracle.com/docs/cd/B14117_01/nav/portal_3.htm). Just Google for key phrases like 'Browse the list of books, including PDF for printing' for 9i, or 'ADM link goes to the Administrator's Guide' which gets you the 10g book list - and you'll see what I mean. I've included the 'repeat the search with the omitted results included' option.

What would be really useful is to open up Metalink itself (only customer SRs excepted). Too many Oracle developers need access, but can't get it because of short sighted officiousness. We developers (especially freelancers like me) often get the short straw - we're expected to have all Oracle knowledge at our fingertips, and yet we are effectively prevented from using one of the best available resources for it. Our employers just won't give us access to the support contract information necessary to get connected. BTW, I stress that the main fault is not Oracle's; it lies more with jobsworths (Brit expression I think - those people who say "it's more than my job's worth" as an excuse for anything).

Oracle Corp can and has made valid arguments for secrecy - mainly around its intellectual property rights and its competitors. But anyone who wants to indulge in industrial espionage can probably afford to buy a support license and get into Metalink anyway. The net effect is that the security defeats exactly the wrong group.

So go on Redwood - open up a bit more; you know you want to!

Monday, February 12, 2007

Schema models, FK constraints and Oracle Apps - a matter of entropy?

There's been an interesting series of posts on Oracle-L recently, which started with a request for a schema model of Oracle Apps. Jared Still posted this response, which included the observation:

Oracle Apps - Not quite as sure about it, but I believe its origins
predate the use of referential integrity in the database.



I worked on Oracle (UK) Accounting v3 during the mid-late 80s, some ideas from which (code combinations for example) were 'borrowed' for what became Oracle Financials. IIRC that was around 1987-8. HR development started a little later, and being UK based was a lot more CASE (Oracle Designer) inclined (they shared the same building). Manufacturing originated in the US consulting organisation, and again they were somewhat more methodological than the original Apps team in Redwood. In both cases I think initial development still pre-dated the implementation of effective foreign keys in Oracle 7. For Financials, even the option of FKs in the dictionary was not yet on the menu.

The (defensible) strategy to 'disintegrate' the applications (eg having separate GL, AP, AR modules in Financials) made it easier to get early releases out of the door - but at the cost of hiding relationships. And of course the whole Application Foundation ethos of configurable code combinations and flexfields means that many relationships are impossible to implement as Oracle serverside constraints out of the box.

Since then, I guess the normal product development priorities have reigned: functionality/saleability first, customer bug fixes second, and engineering/non-functional improvements last. Performance fixing priority goes up and down the scale according to the level of pain being felt by critical customers (Cary Millsap will remember performance testing of release 9, for example).

Just try to explain to a VP of Applications that you want to spend (managers prefer 'invest') tens of man-years documenting and 'improving' internals. It's like painting the Golden Gate bridge - once you start, you never stop. That budget has to come from somewhere - and the other priorities always seem more attractive. So there is a tendency to maximise entropy (btw that's one of the main reasons why startups can beat gorillas).

All the large ERPs seem to suffer the same problems - it's not just Oracle. But as one poster said - the more gotchas there are in the Apps, the more work for us... in the short term at least.

Happy obfuscating!

Tuesday, February 06, 2007

The more the meta - 'Intentional Software' meets reality?

This quite long Technology Review article catches up with Charles Simonyi - Microsoft's former Chief Architect and the alleged inventor of Hungarian variable naming convention - who will be rocketing up to the International Space Station in April.

In his day job, he has been developing a 'new' concept now known as 'intentional software'; for every problem domain, developers will create generic tools which users can employ to "guide the software's future evolution".

Code generators, CASE tools and IDEs could all be seen as primitive examples of what he's on about. They also point to the possible fly in the ointment. As you approach the problem boundaries, any preconceived tool inevitably becomes increasingly inefficient; and the abstraction often makes it difficult or impossible for the user/developer to interfere productively. This is the law of leaky abstractions.

That's not to say that domain-specific generators are a bad thing per se; the template driven generator underlying Constellar Hub was a key component in its success, allowing us to quickly improve and extend the range of objects it could generate. When the tool runs out of steam, code developers can simply jump out of the tool and modify the generated code (sadly, that seems to be the permanent situation for most J2EE - more code and less declarative than any 1980s 4GL) or use a different abstraction.

But as far as Simonyi's work goes, you can count me in with the sceptics. Changing the representation of a problem (from textual code to a graphical parse tree, for example) does not in itself make the problem easier to solve. I saw my first 'visual' coding system in about 1981 - on a VT100 at that - and it was both brilliant and (without a mouse) completely **** useless at the same time. Still, it will be fun watching him throw all his money at it...

Ciao!

Sunday, January 14, 2007

Kalido in a box?

Bloor's Philip Howard writes in IT-Director.com that Kalido gets a competitor.

The problem Kalido addresses is dealing with change in a datawarehouse; for nearly 10 years Kalido has been just about the only company trying to solve it. The new kid on the block aiming to manage changing/evolving business models without having to completely rebuild your datawarehouse is BIReady, a small Dutch company (hmm, Kalido came out of Royal Dutch Shell ... do those Netherlanders know something we don't?) which claims to deliver a 'fully model driven' data warehouse with support for history of change.

This is an area which has traditionally been dealt with by large numbers of staff and consultants. Kalido has managed to make some sales in large multi-national companies who value the benefits of managing not just evolving models, but multiple concurrently active models. BIReady looks too small to bite off these very large organisations (like Shell, Unilever etc) - but they may help to commoditise this part of the DW market, and perhaps make people think about model-driven warehousing further down the food chain.

One to watch.

Saturday, January 06, 2007

Listen to the music

I love listening to music. At work, I use my own laptop like a (rather large) iPod - it's a great way to zone out of the open plan hubbub. Currently I'm just about to hit 6000 tracks, which I mostly have on permanent shuffle. We're looking into sound systems for our house in Italy and although I think the Sonos system (described here by Joel Spolsky) looks fab, I also tripped over the Slim Devices Squeezebox and wanted to have a go with that first, as it's a lot cheaper.

Essentially, we've moved our music collection onto a network attached file server (from Qnap, which conveniently comes with the Slimserver software pre-installed on Linux). You can continue to rip CDs using Windows Media Player (just pointing it at the new network directory).

Slimserver indexes the collection (very slowly - what's that all about?) and then you're ready to go.

Setting up the Squeezebox was easy - apart from revealing yet again what a poor wireless network we have. Eventually I convinced myself to replace the ADSL/wireless router, swapping a Belkin which has never been very reliable for a Netgear Rangemax, which was a doddle to set up and so far seems solid as a rock. The Squeezebox is in the living room, plugged into the home cinema's spare input (DVD, Sky+ and now this).

The Squeezebox looks fabulous - sleek and white and not too small (or large). The music reproduction is fine as far as my aging ears and cheap home cinema speakers can tell; and I can easily re-rip my CDs at a higher bit rate. However needing a separate remote is a bit of a pain (that now makes FIVE remotes to lose). That's where Sonos wins - they have a beautiful (massively expensive) wireless remote that can control all your Sonos Zone Players using an iPod-like panel. The Squeezebox has a complete phonepad of function keys, and the interaction with the display is not entirely intuitive.

Anyway, it all works fine (as long as we balance the Squeezebox on a box of chocolates - it must be in the worst place in the house for wireless, as far as possible from the base station and practically behind the TV). So now I have to work out how to build playlists, and then decide if I want to fork out double or treble for Sonos at the house in Italy.

Choices, choices!