Databases are like cars and trucks:
there isn't one choice that's
right for everyone.
is occasionally asked what the best database is.
This is a bit like asking, what is the ideal car?
The first question is "what do you want it for?"
and the second "what is the budget"?
Some factors worth considering are:
- Hard constraints from the target application
(many database related applications only run on certain databases)
- Local experience and expertise
- Scale of the database (the Mack truck versus mini-scooter question)
- Local versus remote/unattended operation
- Reliability requirements
- Ease-of-use requirements
- Performance requirements
The marketing materials from the various database
vendors often suggest performance is a critical
factor in database selection.
In fact most performance problems can be traced to
- inappropriate or poorly worked out database design
- failure to execute correctly against the design
- insufficient hardware: too little RAM, slow disks, small network pipes and the like
None of these are significantly affected by the underlying speed of the
database.
Arguably the most popular database for small to medium organizations.
The Microsoft version was spun off from a joint venture with Sybase in the 90s.
At first, it was, in all candor, not ready for prime time.
But there has been a great deal of work down with it:
the optimizer is now first rate,
as are the stored procedures.
It does lack a reasonable front end. Apparently Microsoft thinks that the combination of the reporting services
and Access will make up for this.
Or it doesn't want SQL Server to compete with Access, which at this point has little value except as a frontend tool.
The data transformation services (DTS) tools
are good.
(Some shops have acquired SQL Server just for DTS.)
The system administration tools are easy to use.
One can think of SQLite, MySQL, & PostgreSQL as functioning like the different sizes of stringed instruments in an orchestra.
SQLite (as the name suggests) is at the low end, PostgreSQL is at the high end (competing with Oracle), and MySQL is comfortably in the middle.
MySQL is definitely the most popular single open source database.
It is a good fit to a lot of small to medium problems;
there is a wealth of documentation for it;
and it is easy to run.
Some concerns have been expressed over the fact that Oracle has bought it.
There may be some temptation for Oracle to starve MySQL of resources to avoid competition with Oracle.
Several forks of MySQL have been started to make sure that the basic product stays open source.
It is unclear how much traction these forks are getting, however.
PostgreSQL dominates the high end of the open source database community.
It competes with Oracle on speed and is at least as reliable.
The community is large, international, & growing rapidly.
The weakness of PostgreSQL is in the tools: it doesn't have the same rich ecology of supporting frameworks
and tools.
This situation is being addressed & it will be interesting to see how the competition between PostgreSQL and Oracle
plays out.
The 800 pound gorilla of the database market.
Oracle is a perfectly reasonable choice
as a primary database,
although it has some "gotchas" which
make consideration
of alternatives prudent.
In particular,
Oracle is rather harder to administer than some other databases.
In fact,
as one of our colleagues discovered,
it is possible to get an Oracle database so curdled
that it is not possible to run the restore procedures.
has done ports from
Informix to Oracle.
Our experience is that the Oracle cost-based optimizer was not in the same league
as the Informix optimizer;
it needed a great deal more help from nearby humans
if it was to execute queries at a reasonable speed,
given a specific hardware configuration.
Oracle's PL/SQL language has some nice features.
PL/SQL's debugging support was bizarre,
requiring manual intervention by the
DBA for setup.
Oracle's bulk load/unload procedures -- needed for conversions,
synchronization with remote databases, and the like --
are unimpressive.
Oracle's strong suit is its suites: Oracle
has a wealth of applications, including much stuff for the web.
(Apparently this plentitude is partly due to political issues within Oracle:
different factions will construct essentially the same
tool in an attempt to garner favor from Ellison.)
SQLite is a nice fit for many single user problems, i.e. providing an SQL database that fits in a cell phone.
It doesn't provide more advanced features like simultaneous use by multiple users,
but in many contexts you don't need such.
So if you can reduce overhead by omitting them, go for it!
|
The "NoSQL" databases having been getting much more traction lately.
These includes MongoDB, Riak, CouchDB, Redis, and others.
These typically use a simple hashkey-to-blob model.
The simplicity can yield considerable increases in speed and flexibility,
at the cost of some loss of precision and reliability.
If you are serving things like wikipedia pages or large, sloppy documents,
then the NoSQL databases can be a good fit.
You pays your money and you takes your choice.
One approach
is to use a relational database for the high sensitivity core of an application: user account data and purchase history say,
while using a NoSQL database to manage documents & other large objects.
PostgreSQL supplies an "hstore" data type which is intended to give NoSQL type access within a relational framework.
Apparently the speed for hstore is about the same as the speed for a pure NoSQL database.
has worked with Informix since 1987,
starting with version 2 (Informix is now on version 9).
has been particularly happy with
the Informix Standard Engine (SE) product.
This has offered an unequaled
combination of ease of use and performance,
within its weight class
(again, don't build an airline reservation system with this).
In general the Informix line offers a nice combination of ease of administration
and performance,
making
it surprising they do not have more market share than they do.
Their cost-based optimizer is good,
and has only occasionally required hinting.
(Hinting: putting comments in an SQL query to tell the optimizer
how to run the query to achieve best performance.)
The Informix 4GL language -- a tool oriented to the old 24 x 80 screens --
was (and is) quite good.
The Informix stored procedure language (SPL) is fast enough,
but curiously limited in features by current standards:
primitive exception handling for instance,
and relatively few operators.
Informix does not offer a lot of GUI tools,
but good GUI tools are available from third parties.
has built a variety
of Informix databases and applications.
These include
cable advertising,
medical laboratory,
and
financial applications.
built a large
cemetery management database in this.
Ingres has good performance
and a clean design.
found especially attractive
the fact that all of the reports and screens
were fully described within the database itself.
This meant that if some small change was required to a field -- say widening it --
the change could often be applied by doing a single update
to the internal tables,
without need to go the perhaps twenty screens where that field was used.
Ingres was originally an academic product.
This perhaps showed in the error messages,
which would sometimes have been less confusing if written in Greek,
as then no time would have been wasted attempting to decipher them.
While not much present in the market, its spirit lives on in PostgreSQL,
which is in part derived from Ingres
A fast if somewhat strange product,
Omnis is really a hierarchical rather than a relational database.
But it gives best overall performance
if its native hierarchical
features are eschewed in favor of the relational.
Omnis is characterized by a somewhat idiosyncratic
internal language but very good performance.
It has a rich feature set.
As its name suggests,
Filemaker is really a flat file database,
albeit with some relational features,
and the word "relational" featured prominently on the shrinkwrap.
It is very easy to build a simple but attractive front-end
with this tool.
And it is sufficiently simple that
much routine database maintenance
and enhancement
may be done
by non-IT personnel.
4th Dimension
is another effective tool at the low end.
It has very good facilities for constructing
GUI front-ends,
and a rich internal language.
It is not a relational database,
again not withstanding the
statements on the shrinkwrap.
has found 4th Dimension to be a good
tool for building user-friendly one person
databases.
In our experience,
its multi-user and networking performance
rule it out for use on a larger scale.
has worked with a number of other SQL and non-SQL databases as well,
e.g. Empress, Access, File.
|