There are numerous relational databases out there, each providing their own different level of functionality. If you’re building standalone java application and want to distribute your application across multiple computers. Then this question frequently comes out: “Which Embedded database should I use in this situation? There are plenty of open source & commercial relational database available for you. In this article, I’ll attempt to summarize some of the most popular embedded databases (Mostly Open source) along with the process of selecting one that is best fit for you and your needs.
What is Embedded database?
Let’s first start with definition of database: A database can be considered as an organized collection of data. Database usually interacts with the application (web or standalone) to process the data as per the application requirement. Embedded database is usually used in situations where you need to distribute your java application across multiple computers.
In this discussion we’re going to refer to term footprint. Consider it as the size of JAR/DLL.We’ll discuss some database one by one.
Apache Derby
It can be used for java embedded programs & also for online transactions. JDBC and SQL are the programming API’s. It has footprint of about 2MB.Tested version 10.3.2, doesn’t have clustering feature & performance of the embedded java programs is much slower.
Mode: Embedded & Server
Platform: Cross-platform
License Type: Apache 2.0 license
For more information http://db.apache.org/derby/
H2 Database
This is most popular database engine. It offers faster performance, transaction isolation & clustering. It has footprint of about 1MB. It also allows encryption to the database. Being pure java API supports JDBC and SQL and CSV. For authentication it uses SHA-256 and salt, offers much security as well as data integrity.
Mode: Embedded & Server
Platform: Cross-platform
License Type: Mozzila MPL 1.1
For more information http://h2database.com
HSQLDB
This project was discontinued by author of H2 engine. Later it was revived by Hypersonic DB community. It supports of SQL-92, SQL-99, and SQL: 2003 standards. It has footprint of about 600KB.It has fast embedded server performance like H2.
HSQLDB is used by many open source applications like openoffice, InstallAnywhere etc. In practice it is also bundled with Hibernate library. But it doesn’t offer transaction isolation, clustering & database encryption.
Mode: Embedded & Server
Platform: Cross-platform
License Type: BSD license
For more information http://hsqldb.org/
PostgreSQL
PostgreSQL doesn’t offer embedded server functionality. But it can be used for your application in network mode or embedded using PL/Java. It has footprint of about 6mb.It offers built in language functionality PL/pgSQL similar to oracle’s PL/SQL. PostgreSQL also supports row-returning functions. Supports many other scripting languages like Perl, ruby, python, sh. It offers transaction isolation, clustering and cost based optimizer.
It is used by some of applications like: skype, Sunxvm & online websites like: hi5,BASF, Sony online etc.
Mode: Server
Platform: Cross-platform
License Type: BSD license
For more information http://postgresql.org
MySQL
Like PostgreSQL, MySQL doesn’t offer embedded mode explicitly. It requires external multithreaded library to link with application. It is most popular for network mode. It is used in various web applications & hosts. It has much less footprint than PostgreSQL about 4MB.It offers transaction isolation, clustering and cost based optimizer.
Mode: Network/(Embedded?)
Platform: Cross-platform
License Type: GPL license
For more information http://www.mysql.com/)
Discontinued Embedded Engines
There are some of the database which are discontinued but still they can be used for your java application. Some of the discontinued database engines are : McKoi, DaffodilDb and OneDb.
Choosing the engine
All above mentioned engines are open source and can be used to build even commercial applications. From above all H2 outperforms the list. Then Derby and HSQLDB engines. MySQL and PostgreSQL are not preferred for embedded applications, as they don’t offer this functionality implicitly. You’ve to work with multi-threaded libraries to get yourself started. There are many other factors to consider while choosing the engine like: Application requirements, Encrypted database, clustering & use of external libraries etc. So do some research according to your needs & then select the one that fits your job.
I hope the information above helped. I want to hear what you think of the choices above, or maybe fill me in on one I should have mentioned. Please don’t hesitate to give your feedback or suggestions.