
While working on a recent project, I came into a situation where I needed to do an “exists” query, using a Criteria-style query. The online documentation for this feature is a little sparse, so I thought I’d share what I did.

The Pizza-shop Data Model (Again)

I keep reusing a data model for a Pizza shop in my posts, and this post will be no different. This data model first appeared in my JPA mapping tutorial. Here’s an ERD of the model again:

Find me Orders with Small Pizzas!
Given this model, what if we needed to find each order that contained a small pizza? Suppose your database had the following data:


As with my earlier posting, the object model has a PizzaOrder class that contains a Set of Pizza objects which correspond to each customer order. Your first inclination might be to do a criteria-within-a-criteria, like this:

Criteria criteria = session.createCriteria(PizzaOrder.class);
criteria.createCriteria("pizza").add(Restrictions.eq("pizza_size_id", 1));
List<PizzaOrder> ordersWithOneSmallPizza = criteria.list();

You’d be in for a bit of a surprise, though. While you might expect only two Pizza orders to be returned (namely, orders #1 and #2), you’ll actually have three orders in the result set; because order #2 has two small pizzas in it, order #2 will appear twice in your results!

The reason why this happens is pretty simple, and it becomes clear if you enable Hibernate’s SQL output feature. To locate all of the pizza orders which contain a small pizza, Hibernate needs to do an inner join to the PIZZA table. This is true regardless of whether you’ve mapped the Pizza objects to be fetched lazily; the join is required because of your query criteria, not because of your mappings. Note: it’d be really nice if Hibernate were clever enough to identify from the result set that it had duplicate PIZZA_ORDER records, and build the Set of Pizza objects accordingly, but I suspect that this would be a very difficult thing to do, so I’m not holding my breath.

The Right Way to Do It
What you’re really trying to do is to obtain all Pizza Orders where an associated small pizza exists. In other words, the SQL query that you’re trying to emulate is

SELECT *
  FROM PIZZA_ORDER
 WHERE EXISTS (SELECT 1
                 FROM PIZZA
                WHERE PIZZA.pizza_size_id = 1
                  AND PIZZA.pizza_order_id = PIZZA_ORDER.pizza_order_id)

The way that you do that is by using an “exists” Subquery, like this:

Criteria criteria = session.createCriteria(PizzaOrder.class, "pizzaOrder");
DetachedCriteria sizeCriteria = DetachedCriteria.forClass(Pizza.class, "pizza");
sizeCriteria.add(Restrictions.eq("pizza_size_id", 1));
sizeCriteria.add(Property.forName("pizza.pizza_order_id").eqProperty("pizzaOrder.pizza_order_id"));
criteria.add(Subqueries.exists(sizeCriteria.setProjection(Projections.property("pizza.id"))));
List<PizzaOrder> ordersWithOneSmallPizza = criteria.list();

And voila, the result will contain two PizzaOrders!
Photo Credit: Squeaky Marmot

One of the ancillary projects of the Hibernate framework is the Hibernate Tools toolset. Using Hibernate Tools, you can automatically generate your mapping files (or, if you prefer, JPA annotations), POJOs, and DDL from your database schema.

I’ve been enamored with the “Convention over Configuration” web frameworks lately (e.g., Grails, Django), and wondered how hard it’d be to reproduce some of their magic ORM functionality using Hibernate Tools. I’ve concluded that I’ve still got a long ways to go, but that I might have some worthwhile tips to share in the meantime.

My progress has been impeded a bit because the documentation for Hibernate Tools is pretty weak. Hibernate Tools was really built with Eclipse users in mind, and a large portion of the documentation is devoted to explaining how to use the IDE. I’m not using Eclipse; I intend to use the tools at the command line, and command-line use really isn’t covered.

0.) Get the tools and dependencies.
I’m still in the dark ages – I use ant instead of maven for builds, and need to track down jar dependencies the old-fashioned way. In addition to the usual Hibernate jars, you’ll also need hibernate-tools, freemarker, and jtidy.


1.) Setup an Ant task

This is pretty straightforward, and it is explained fairly well in the documentation. First, define yourself a Hibernate Tools task. Mine looks like this:

<taskdef name="hibernatetool" classname="org.hibernate.tool.ant.HibernateToolTask">
  <classpath>
    <fileset refid="hibernate.libs" />
    <fileset refid="hibernatetools.libs" />
    <fileset refid="app.libs" />
    <fileset refid="compile.libs" />
    <pathelement path="${build}" />
    <pathelement path="${basedir}" />
  </classpath>
</taskdef>

Next, create a task that uses our new definition. We’re going to be creating the mapping files and POJOs in this example. It will look something like this:

<target name="gen_hibernate" depends="compile">
  <delete dir="${genhbm}" />
  <mkdir dir="${genhbm}" />
  <hibernatetool>
    <jdbcconfiguration
        configurationfile="${src}/hibernate-connection.cfg.xml"
        revengfile="hibernate.reveng.xml"
        packagename="us.mikedesjardins.data"
        detectmanytomany="true"
        detectoptimisticlock="true" />
    <hbm2hbmxml destdir="${genhbm}" />
    <hbm2java destdir="${genhbm}">
      <property key="jdk5" value="true" />
      <property key="ejb3" value="false" />
    </hbm2java>
  </hibernatetool>
</target>

You’ll note that I’ve created a hibernate configuration file above which only contains connection information. This is because I ran into problems when I tried to use a hibernate configuration with cache configuration elements in it. You’ll also note a reference to a file called hibernate.reveng.xml. This file controls various aspects of the reverse engineering process. Mine is very simple – I’m working with MS SQL Server, and I don’t want it to include any of the system tables which begin with the prefix sys:

<hibernate-reverse-engineering>
  <!-- Eliminate system tables -->
  <table-filter match-name="sys.*" exclude="true" />
</hibernate-reverse-engineering>

That should be all that you need to do to get started. When you run this ant target, your domain object POJOs and Mappings should end up in the ${genhbm} directory.

2.) First Problem – Table Names
Let’s say you’re in a company that prefixes most of its tables with something silly, like “tb_”. In that case, Hibernate Tools will happily create all kinds of POJOs for you named TbAccount, TbAddress, etc. This is probably not what you want. Fortunately, Hibernate Tools lets you provide your own “Reverse Engineering Strategy” class to deal with situations just like this.

First, update your build target in Ant to include a reference to your strategy class as an attribute of the jdbcconfiguration element:

<target name="gen_hibernate" depends="compile">
  <delete dir="${genhbm}" />
  <mkdir dir="${genhbm}" />
  <hibernatetool>
    <jdbcconfiguration
        configurationfile="${src}/hibernate-connection.cfg.xml"
        revengfile="hibernate.reveng.xml"
        packagename="us.mikedesjardins.data"
        detectmanytomany="true"
        detectoptimisticlock="true"
        reversestrategy="us.mikedesjardins.hibernate.CustomReverseEngineeringStrategy"/>
    <hbm2hbmxml destdir="${genhbm}" />
    <hbm2java destdir="${genhbm}">
      <property key="jdk5" value="true" />
      <property key="ejb3" value="false" />
    </hbm2java>
  </hibernatetool>
</target>

Next, we’ll need to create the strategy class. The documentation recommends that you extend the DelegatingReverseEngineeringStrategy class to do this, like this:

package us.mikedesjardins.hibernate;
 
import org.hibernate.cfg.reveng.DelegatingReverseEngineeringStrategy;
import org.hibernate.cfg.reveng.ReverseEngineeringStrategy;
 
public class CustomReverseEngineeringStrategy extends DelegatingReverseEngineeringStrategy {
  public CustomReverseEngineeringStrategy(ReverseEngineeringStrategy delegate) {
    super(delegate);
  }
}

Unfortunately, I was unable to find any JavaDoc documentation for the Hibernate Tools classes, so I had to do a bit of trial and error. It turns out there is a method called tableToClassName that can be overridden. This method accepts a TableIdentifier as its input parameter and returns the class name as a String. Thus, to strip the Tb prefix from the class name, we could do something like this:

public String tableToClassName(TableIdentifier tableIdentifier) {
  // Let the default strategy build the name, then strip the "Tb" prefix.
  String className = super.tableToClassName(tableIdentifier);
  if (className.startsWith("Tb")) {
    className = className.substring(2);
  }
  return className;
}

Compile your CustomReverseEngineeringStrategy class, make sure it’s on the hibernatetools task definition’s classpath, and re-run the task. Voila! The Tb’s are gone!

There’s a similar method for column names, columnToPropertyName, that handles naming your persisted class’s member variables. In my current project, the primary key columns are named the same as the table name, with “_id” appended to the end. However, in the object model, we like to just name the primary key property “id” to simplify the creation of generic DAOs, so I use the columnToPropertyName method for that.
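For illustration, here’s a minimal sketch of that override, assuming the convention described above (primary key columns named after the table with “_id” appended); anything that doesn’t match falls through to the default behavior:

public String columnToPropertyName(TableIdentifier table, String column) {
  // Assumes PK columns are named "<table_name>_id"; map those to plain "id".
  if (column.equalsIgnoreCase(table.getName() + "_id")) {
    return "id";
  }
  return super.columnToPropertyName(table, column);
}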

3.) Next Problem – The Generated POJOs need tweaking.
Perhaps the code that Hibernate generates is not to your liking. Maybe you work with standards-compliance-nazis who want a copyright header at the top of each class file. Or maybe all of your persisted objects implement the same interface for use with generic DAOs.

Under the covers, Hibernate Tools uses FreeMarker to generate POJOs and mapping files. The templates that it uses can be edited to your liking, but doing so is not very straightforward.

First, unzip the hibernate tools jar into a temporary directory. Once unzipped, you’ll find a directory named pojo. This directory contains all of the templates used for POJO generation. Copy that directory into your build area and name it something like hib_templates.

The hbm2java task is actually just an alias for another hibernate tools target called hbmtemplate. With hbmtemplate, you can configure the location of the source templates. So we’ll remove the hbm2java task from the gen_hibernate target, and replace it with an equivalent hbmtemplate task:

<target name="gen_hibernate" depends="compile">
  <delete dir="${genhbm}" />
  <mkdir dir="${genhbm}" />
  <hibernatetool>
    <jdbcconfiguration
        configurationfile="${src}/hibernate-connection.cfg.xml"
        revengfile="hibernate.reveng.xml"
        packagename="us.mikedesjardins.data"
        detectmanytomany="true"
        detectoptimisticlock="true"
        reversestrategy="us.mikedesjardins.hibernate.CustomReverseEngineeringStrategy"/>
    <hbm2hbmxml destdir="${genhbm}" />
    <hbmtemplate templateprefix="pojo/"
                 destdir="${genhbm}"
                 template="hib_templates/Pojo.ftl"
                 filepattern="{package-name}/{class-name}.java">
      <property key="jdk5" value="true" />
      <property key="ejb3" value="false" />
    </hbmtemplate>
  </hibernatetool>
</target>

Now that we’ve told Hibernate Tools where our template lives, we can edit it to our liking. For example, if we want to add a copyright notice to the top of each generated POJO, we could do it like this:

//
// Copyright 2008 Big Amalgamated Mega Global Software Corp.  All Rights Reserved.
//
${pojo.getPackageDeclaration()}
// Generated ${date} by Hibernate Tools ${version}
 
<#assign classbody>
<#include "PojoTypeDeclaration.ftl"/>; {
.
.
.
(etc)

Now that we’ve done this, we can create more templates that, e.g., generate empty DAO classes, automatically create derived classes, etc.

Hope that helps!

Photo Credit: geishaboy500

In a recent post, I showed a trick for determining which users in a system are running a long-running query. One commenter suggested that using a full-text search system made a lot of sense in those situations, and I wholeheartedly agree. So I decided to devote a new post to Hibernate Search!

In this example, I provide a live, interactive search of an online cheese database, and you can download the example (more on that at the end of the post).

Step One – The Test Data
The part of this project I spent the most time on was creating test data for the example program. I headed over to Freebase to see what was available for data sets. Among lots of other things, they have a free database of cheese. First, I downloaded the data in TSV format, dumped it into a raw table, and massaged it into a relational, normalized schema. I ended up with this when I was done:

So, a CHEESE can have only one ORIGIN (a country or region where the cheese is made), is made from one-to-many types of MILK, and may have zero-to-many TEXTURES associated with it.

Step Two – Build the Domain Model
The domain model for this system is very simple. Each Cheese class has an Origin, and a set of Milk and Textures:

@Entity @Table(name="MILK")
public class Milk {
  @Id @Column(name="milk_id")
  private Integer id;
 
  @Basic @Column(name="name")
  private String name;
 
  @Version @Column(name="version")
  private Integer version;
 
// Accessors omitted
}
@Entity @Table(name="ORIGIN")
public class Origin {
  @Id @Column(name="origin_id")
  private Integer id;
 
  @Basic @Column(name="name")
  private String name;
 
  @Version @Column(name="version")
  private Integer version;
 
// Accessors omitted
}
@Entity @Table(name="TEXTURE")
public class Texture {
  @Id @Column(name="texture_id")
  private Integer id;
 
  @Basic @Column(name="description")
  private String description;
 
  @Version @Column(name="version")
  private Integer version;
 
// Accessors omitted
}
@Entity @Table(name="CHEESE")
public class Cheese {
  @Id @GeneratedValue(strategy=GenerationType.IDENTITY)
  @Column(name="cheese_id",nullable=false,unique=true)
  private Integer id;
 
  @Basic @Column(name="name")
  private String name;
 
  @ManyToOne(cascade={CascadeType.ALL})
  @JoinColumn(name="origin_id",nullable=false)
  private Origin origin;
 
  @ManyToMany(cascade={CascadeType.PERSIST,CascadeType.MERGE})
  @JoinTable(name="CHEESE_MILK_MAP",
             joinColumns=@JoinColumn(name="cheese_id"),
             inverseJoinColumns=@JoinColumn(name="milk_id"))
  private Set<Milk> milks = new HashSet<Milk>();
 
  @ManyToMany(cascade={CascadeType.PERSIST,CascadeType.MERGE})
  @JoinTable(name="CHEESE_TEXTURE_MAP",
             joinColumns=@JoinColumn(name="cheese_id"),
             inverseJoinColumns=@JoinColumn(name="texture_id"))
  private Set<Texture> textures = new HashSet<Texture>();
 
  @Version @Column(name="version")
  private Integer version;
 
// Accessors omitted
}

Step Three – Add Search Annotations and Configure Lucene
Behind the scenes, Hibernate Search uses the Apache Lucene search engine to do its indexing. In short, it maintains a mapping of object IDs to search terms in an external file, and updates the file when objects are added, updated, or deleted. To start using Hibernate Search, you’ll need to configure the location of these index files, as well as a search directory provider (we’ll just use the default). This is done in your Hibernate properties file, or (if you use JPA, like me), in persistence.xml:

<property name="hibernate.search.default.directory_provider" value="org.hibernate.search.store.FSDirectoryProvider" />
<property name="hibernate.search.default.indexBase" value="/var/lucene/cheese-indexes" />

Next, you need to indicate to Lucene which classes need to be indexed. You also need to indicate which data fields 1.) contain the document ID, and 2.) contain relevant search text. In our example, we only need to index the Cheese objects. We want to allow users to search on cheese name, milk name, origin, and texture.

First, we indicate that we want to index the Cheese objects by applying the @Indexed annotation to the class, and we elect to use the id field to identify the Cheese objects to Lucene by applying a @DocumentId annotation to it. Next, we indicate that the cheese name contains searchable text by adding the @Field(index=Index.TOKENIZED, store=Store.NO) annotation to it. The annotation parameters tell Hibernate Search to tokenize the field’s text with Lucene’s default analyzer and not to store a copy of the field’s content in the index.

We also want to allow users to search on origin, milk name, and texture. This text is not contained within the Cheese object; instead, it lives in related objects. So we need to add the @IndexedEmbedded annotation to the member variables in the Cheese class that refer to the objects containing the searchable text, and we also need to add the @Field(index=Index.TOKENIZED, store=Store.NO) annotation to the Milk, Origin, and Texture classes to indicate which fields are searchable.

When you’re done, the modified domain classes will look like this:

@Entity @Table(name="MILK")
public class Milk {
  @Id @Column(name="milk_id")
  private Integer id;
 
  @Field(index=Index.TOKENIZED)
  @Basic @Column(name="name")
  private String name;
 
  @Version @Column(name="version")
  private Integer version;
 
// Accessors omitted
}
@Entity @Table(name="ORIGIN")
public class Origin {
  @Id @Column(name="origin_id")
  private Integer id;
 
  @Field(index=Index.TOKENIZED)
  @Basic @Column(name="name")
  private String name;
 
  @Version @Column(name="version")
  private Integer version;
 
// Accessors omitted
}
@Entity @Table(name="TEXTURE")
public class Texture {
  @Id @Column(name="texture_id")
  private Integer id;
 
  @Field(index=Index.TOKENIZED)
  @Basic @Column(name="description")
  private String description;
 
  @Version @Column(name="version")
  private Integer version;
 
// Accessors omitted
}
@Indexed
@Entity @Table(name="CHEESE")
public class Cheese {
  @DocumentId
  @Id @GeneratedValue(strategy=GenerationType.IDENTITY)
  @Column(name="cheese_id",nullable=false,unique=true)
  private Integer id;
 
  @Basic @Column(name="name")
  @Field(index=Index.TOKENIZED, store=Store.NO)
  private String name;
 
  @IndexedEmbedded
  @ManyToOne(cascade={CascadeType.ALL})
  @JoinColumn(name="origin_id",nullable=false)
  private Origin origin;
 
  @IndexedEmbedded
  @ManyToMany(cascade={CascadeType.PERSIST,CascadeType.MERGE})
  @JoinTable(name="CHEESE_MILK_MAP",
             joinColumns=@JoinColumn(name="cheese_id"),
             inverseJoinColumns=@JoinColumn(name="milk_id"))
  private Set<Milk> milks = new HashSet<Milk>();
 
  @IndexedEmbedded
  @ManyToMany(cascade={CascadeType.PERSIST,CascadeType.MERGE})
  @JoinTable(name="CHEESE_TEXTURE_MAP",
             joinColumns=@JoinColumn(name="cheese_id"),
             inverseJoinColumns=@JoinColumn(name="texture_id"))
  private Set<Texture> textures = new HashSet<Texture>();
 
  @Version @Column(name="version")
  private Integer version;
 
// Accessors omitted
}

Step Four – The Servlets
In this example, I didn’t want to rely on any web frameworks or even on JSPs, so I wrote a good old-fashioned servlet to exercise the search function. No sane person would ever do it this way. There are actually two servlets in the example – one for application initialization and one for the page itself.

The initialization servlet does the work of indexing all of the database data the first time through. For our small data set, this takes less than a minute. For larger data sets, it wouldn’t make sense to re-index everything every time you start the application. The initialization code iterates over all of the Cheese objects and tells the FullTextEntityManager to index each one:

public void init() {
  Dao<Cheese> dao = new CheeseDao();
  EntityManager em = dao.getEntityManager();
  FullTextEntityManager fullTextEntityManager = Search.createFullTextEntityManager(em);
 
  List<Cheese> cheeses = em.createQuery("select c from Cheese as c").getResultList();
  for (Cheese cheese : cheeses) {
    fullTextEntityManager.index(cheese);
  }
}

The main page servlet has some code in the doPost method to perform the search based on the contents of the text form field. That code looks like this (I’ve shortened it up a bit here by removing some error checking and HTML output):

public void doPost(HttpServletRequest request,
                   HttpServletResponse response) throws ServletException, IOException {
  response.setContentType("text/html");
  PrintWriter out = response.getWriter();
  emitHeader(out);
 
  String searchTerm = request.getParameter("searchterm");
  EntityManager em = dao.getEntityManager();
 
  FullTextEntityManager fullTextEntityManager =
    org.hibernate.search.jpa.Search.createFullTextEntityManager(em);
  MultiFieldQueryParser parser =
    new MultiFieldQueryParser( new String[]{"name",
                                            "origin.name",
                                            "milks.name",
                                            "textures.description"},
                               new StandardAnalyzer());
 
  try {
    org.apache.lucene.search.Query query = parser.parse(searchTerm);
    javax.persistence.Query hibQuery =
      fullTextEntityManager.createFullTextQuery(query,Cheese.class);
    List<Cheese> result = hibQuery.getResultList();
    emitTable(out,result);
  } catch (ParseException e) {
    log.error("Got a parse exception", e);
    throw new ServletException(e.getMessage());
  }
  emitFooter(out);
  out.close();
}

That’s all there is to it! As you can see, setting up Hibernate Search is very simple. Most of the effort for this project was spent creating the data and making the servlets work.

Enjoy the Finished Product
To see this less-than-world-changing application in action, visit it here. You can also download the whole eclipse project and try it out for yourself. It comes with SQL dumps suitable for MySQL and PostgreSQL, and it has been tested with both environments under Tomcat 5.5.

Photo Credit: Chris Buecheler

Your Problem
You have a data model with a table that contains data you want to aggregate. For instance (returning to my venerable Pizza Shop example), let’s say you have a PRODUCT table that enumerates the items your pizza shop sells, and a LOCATION table that contains all of your retail locations:

We also have a table that contains the previous day’s sales totals, broken down by PRODUCT and LOCATION:

This is all well and good, but we’ve been asked to create an Executive Dashboard for the President of the pizza chain, and she would like to see daily sales by product. She is not interested in a breakdown by location.

We could tally it up client side…
What if we just loaded the entire table into the client, iterated over it to tally up per-product results, and presented that? Unfortunately, ORM libraries are pretty stupid in situations like these, and will generate all kinds of expensive reads to the database when you try to solve the problem this way. If you want to learn more about lazy loading, and why you shouldn’t iterate over a collection, check out my earlier post here.
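To make the warning concrete, here’s a hedged sketch of that client-side tally (imports omitted, as elsewhere in these posts). Product, YesterdaySales, and their accessors are hypothetical names for illustration; the problem is that touching a lazy collection inside the loop turns into one extra SELECT per product, the classic N+1 pattern.

Map<Integer, BigDecimal> totals = new HashMap<Integer, BigDecimal>();
List<Product> products = session.createQuery("from Product").list();
for (Product product : products) {
  BigDecimal sum = BigDecimal.ZERO;
  // Each iteration initializes a lazy collection: one extra SELECT per product.
  for (YesterdaySales sales : product.getYesterdaySales()) {
    sum = sum.add(sales.getTotalSales());
  }
  totals.put(product.getId(), sum);
}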

We could just create a view in the Database…
We could just create a view, and aggregate the data there. Then we can easily create a Hibernate mapping to that view. The query for the view is simple:

CREATE VIEW sales_by_product_view AS
SELECT product_id, SUM(total_sales) AS total_sales
  FROM YESTERDAY_SALES
 GROUP BY product_id

However, we are working with a tyrannical DBA. She is not keen on proliferating views throughout our otherwise pristine schema every time the President has decided that the company needs a new widget for the executive dashboard application.

…or we could fake it with a custom loader
Instead, let’s create a Hibernate mapping that generates the same results as the view. First, let’s create a simple POJO to contain the results:

package us.mikedesjardins;

import java.math.BigDecimal;

public class SalesByProduct {
  private Integer id;
  private Integer productId;
  private BigDecimal sales;
  // accessors omitted
}

The corresponding mapping file would look like this. I re-used the product_id for the ID in this example. Note the loader element:

<hibernate-mapping package="us.mikedesjardins">
  <class name="SalesByProduct"
          dynamic-insert="false"
          dynamic-update="false"
          mutable="false">
    <id name="id" type="int" unsaved-value="null">
      <column name="__id" sql-type="int identity" not-null="true" unique="true" />
    </id>
    <property name="productId" type="int">
      <column name="__product_id" not-null="false" />
    </property>
    <property name="sales" type="java.math.BigDecimal">
      <column name="__total_sales" not-null="false" />
    </property>
    <loader query-ref="salesByProductQuery" />
  </class>
  <sql-query name="salesByProductQuery">
    <return class="SalesByProduct" />
    <![CDATA[
    select product_id as __id
         , product_id as __product_id
         , sum(total_sales) as __total_sales
      from YESTERDAY_SALES
     where product_id = :product_id
    group by product_id
   ]]>
  </sql-query>
</hibernate-mapping>
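With that mapping in place, here is a minimal usage sketch. It assumes the accessors I omitted from the POJO above (getProductId(), getSales()) and that Hibernate binds the identifier passed to get() to the loader query’s parameter:

Session session = sessionFactory.openSession();
try {
  // product_id doubles as the identifier, so "get by id" runs the loader query.
  SalesByProduct sales = (SalesByProduct) session.get(SalesByProduct.class, 1);
  if (sales != null) {
    System.out.println("Product " + sales.getProductId()
        + " sold " + sales.getSales() + " yesterday");
  }
} finally {
  session.close();
}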

Note that the loader query needs a parameter for the identifier (the product_id parameter above), or Hibernate will get nasty about parsing it. Hope that helps!

Yesterday I read a blog post by Kenneth Downs entitled “Why I Do Not Use ORM” on his The Database Programmer blog. It wasn’t the first blog post with gripes about Object Relational Mapping, and it certainly won’t be the last. For me, this particular article highlighted a few misconceptions about why and how ORM should be used, and I thought I might chime in with my own perspective.

ORM is not a way to avoid SQL
The first thing that stood out for me was this quote:

“The language SQL is the most widely supported, implemented, and used way to connect to databases. But since most of us have long lists of complaints about the language, we end up writing abstraction layers that make it easier for us to avoid coding SQL directly.”

To his credit, the author wasn’t actually addressing ORM in this paragraph. However, anyone who writes business logic that interacts with a database but is unable to write some basic SQL is like a bull in a china shop; nothing good will come of it regardless of the tools and abstraction layers employed.

The Simple Example
The next example in the post shows how one would write a row to a database in only four lines of PHP code – it looks something like this (I removed the comments):

$row = getPostStartingWith("inp_");
$table_id = myGetPostVarFunction('table_id');
if (!SQLX_Insert($table_id,$row)) {
 myFrameworkErrorReporting();
}

I don’t know PHP, but I think the code reads every field in a posted form whose name starts with inp_, builds an insert statement straight from the IDs on the form fields, and executes it.

Perhaps it’s unfair to criticize this code because “it’s just a simple example,” but if this code is being held up as an example of how short and simple non-ORM code can be, one does have to wonder:

* When the database schema changes, does the HTML need to be updated so that the form fields match the database schema?
* Where are the transactions? What if I need to insert into several tables and roll back the transaction if one fails?
* Does the handy SQLX_Insert method prevent SQL injection attacks?

He goes on to say that this task is made even easier by using a data dictionary to generate the SQL. After reading the “Using a Data Dictionary” article, one has to wonder whether or not the author realizes that it is a very crude form of ORM.

What about Business Logic?
Kenneth Downs tries to head-off any arguments about business logic before they come up, knowing that ORM evangelists will argue that the domain objects can encapsulate essential business logic for the application. His response?

“The SQLX_Insert() routine can call out to functions (fast) or objects (much slower) that massage data before and after the insert operation. I will be demonstrating some of these techniques in future essays, but of course the best permforming and safest method is to use triggers.”

For me, this sounds alarm bells. Triggers slow down transactions. Triggers are in your database, which is often your system’s biggest bottleneck. Triggers silently do things behind your back without telling you. Triggers change databases from vast, efficient places to store relational data, into a lumbering behemoth interpreting procedural code inside big iron.

Conversely, business logic that can be easily distributed across many smaller web servers scales horizontally. The domain layer is a fantastic place to embed simple data massaging – sadly, I often see a pile of persistent entities with getters and setters that don’t do anything.

Counter Example: The Disaster Scenario
Lastly, Ken (can I call him that? What is the etiquette for this sort of thing, anyway?) shows an example of a piece of code that is likely to cause hundreds of unneeded reads to the database in an untuned ORM-based system. I don’t dispute this; in fact, I posted about an almost identical nightmare scenario myself a while ago.

For this, I go back to my “bull in a china shop” analogy. Programmers can write horrible code in any language, with any tool. Layers of abstraction are a double-edged sword, because you need to understand what they are doing for you. But it’s not the tool’s fault; it’s the person misusing it.

Computer Science’s Vietnam
In 2004, Ted Neward famously called ORM Computer Science’s Vietnam. Encouraged by early successes, we got sucked into the quagmire. There are plenty of reasons to be frustrated with ORM, but I’m not sure I agree with Ken’s. I try to hit the 80/20 rule with ORM, and use it where it makes sense. When I get into a convoluted transaction or need to do a large batch of operations, I’m not afraid to dive into SQL and do the work in a stored procedure. I think it’s a good mix. How about you?

Photo Credit: Ryan Dickey

If you develop Hibernate applications against either Sybase or MS SQL Server databases, you may have had the pleasure of seeing this little gem:

09:13:22,155 WARN [] org.hibernate.util.JDBCExceptionReporter – SQL Error: 0, SQLState: 22001
09:13:22,155 ERROR [] org.hibernate.util.JDBCExceptionReporter – Data truncation
09:13:22,155 WARN [] org.hibernate.util.JDBCExceptionReporter – SQL Error: 8152, SQLState: 22001
09:13:22,155 ERROR [] org.hibernate.util.JDBCExceptionReporter – String or binary data would be truncated.
09:13:22,159 ERROR [] org.hibernate.event.def.AbstractFlushingEventListener – Could not synchronize database state with session
.
.
.
Caused by: java.sql.DataTruncation: Data truncation

The most maddening thing about these errors is that the database won’t tell you which column is being truncated. A likely scenario is that you’ve developed a Web or Swing UI that permits the user to enter a value for a String field which is ultimately persisted to a database column, and the UI is not restricting the maximum length of the user’s input to the size that the column supports. The question is, which field is the offender?

Hibernate Validators to the Rescue!
Situations like this are why it’s a pretty good idea to validate your data at the domain layer of your project, and not rely on your database and presentation layer to do all of the validation. Fortunately, Hibernate makes this pretty painless with validator annotations. Just import the org.hibernate.validator.Length annotation, and add the @Length annotation in your entity class, e.g.:

@Column(name="description",length=20)
@Length(max=20)
private String description;

Note that putting the length attribute in the @Column annotation is not enough. That attribute is used for generating DDL, but will not cause any validation to take place.

Now that our new Annotation is in place, we get the slightly more useful stack-trace below:

org.hibernate.validator.InvalidStateException: validation failed for: us.mikedesjardins.data.persist.MyClass
at org.hibernate.validator.event.ValidateEventListener.validate(ValidateEventListener.java:148)
at org.hibernate.validator.event.ValidateEventListener.onPreInsert(ValidateEventListener.java:172)
at org.hibernate.action.EntityIdentityInsertAction.preInsert(EntityIdentityInsertAction.java:119)
.
.
.

Of course, you should catch these exceptions and do something less hostile with them. For now, we’ll just log the details and re-throw so you can get the general gist of what you can do. Here’s a generic insert method in a DAO base class:

public void insert(T valueObject, Class... clazz) {
 try {
   Session s = getSession();
   s.save(valueObject);
   s.flush();
   s.refresh(valueObject);
 } catch (InvalidStateException v) {
   setToRollback();
   log.error("insert(), failed validation " + valueObject.toString(), v);
   InvalidValue[] invalid = v.getInvalidValues();
    for (int i=0; i<invalid.length; ++i) {
     InvalidValue bad = invalid[i];
     log.error("insert(), " + bad.getPropertyPath()
     + ":" + bad.getPropertyName()
     + ":" + bad.getMessage());
   }
   throw v;
 }
}

As you can see, the InvalidStateException contains an array of InvalidValue objects which describe the nature of the validation error. You can either log this information, or even present it to the user.
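If you would rather catch bad values before a save is even attempted, the same annotations can be checked up front with the legacy ClassValidator API. Here is a minimal sketch, where MyClass is a hypothetical entity carrying the @Length annotation, and dao and log come from the surrounding class:

ClassValidator<MyClass> validator = new ClassValidator<MyClass>(MyClass.class);
InvalidValue[] problems = validator.getInvalidValues(myInstance);
for (InvalidValue problem : problems) {
  // The same property name and message you would see in the InvalidStateException.
  log.warn("Rejected " + problem.getPropertyName() + ": " + problem.getMessage());
}
if (problems.length == 0) {
  dao.insert(myInstance);
}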

Hope that helps! Do you use Hibernate’s validator annotations for something nifty? Let us know in the comments!

(to the tune of The Spiderman Theme Song from the original cartoon series)

Hibernate, Hibernate
O/R Mapping sure is great
Gavin King, and H-Q-L
Make my life a living hell
Look out! Here comes Hibernate!

Mapping Files in Hibernate
Wish my team would annotate
Are its queries optimized?
Who knows – it’s always a surprise
Hey there, there goes Hibernate!

Don’t fetch data now
Lazy load, ‘cuz you might guess
I don’t need it now
Hibernate, you do know best!

Hibernate, Hibernate
Friendly neighborhood Hibernate
Grab the jars, and climb aboard
OO models are your reward
To it…
Relational DBs are old-school
When you need a complex tool
You need the Hibernate!

So I guess it’s official: Verizon Wireless is acquiring rival Alltel Wireless for $28 billion. I had another topic in mind for my next post, but I decided to write about this market consolidation instead.

In a former life, I designed and did programming for billing systems for “tier three” (i.e., small, regional) wireless carriers. In many ways, it’s saddening to see what the mobile carrier landscape has become in the United States. One of the things that made it fun to work in the industry was the funky, inspired little companies that were created by scrappy hometown entrepreneurs. It took a certain amount of chutzpah and ingenuity to take on the behemoth telcos, even if it was in a small rural corner of East Podunk, U.S.A.

Lots of people worked for the “Cellular Ones” of the world. I met some folks who got pretty wealthy by building medium-sized businesses by taking on the incumbent wireline carrier in their neck of the woods. These people hailed from small towns like Beckley, West Virginia, or Fort Morgan, Colorado, or Traverse City, Michigan. A lot of the employees were self-trained and didn’t have years of experience as Billing Directors or Network Engineers; sometimes it was so-and-so’s brother-in-law who was appointed to be the “Switch Tech” because he was good with electronic stuff.

I believe that these homegrown businesses are good for America. The people who operate them care about their communities because they see their neighbors every day. They hire local people to staff their call centers instead of outsourcing to distant continents. Their leaders do business with their local friends. These businesses help give their part of the world its own personality.

At one time, I thought that MVNO’s might fill this void – I had hoped that they could supplement the market with their own quirky personalities. But so far, MVNO’s have failed to gain much traction in the U.S.

I realize that some good things will come from consolidation. Bigger companies can often roll out better technology more quickly. It’s harder for, e.g., a small-time operation in Decatur, Illinois to roll out 3G data services than it is for a Goliath like AT&T or Verizon Wireless. But I still mourn the loss of the tier-3s; for me, they exemplified the American entrepreneurial spirit.

Photo Credit: KB35

The Dreaded Search Function
I’ve worked at several jobs where the users have asked for a general-purpose search function in their application. It’s the sort of thing where people want the ability to search on, e.g., all customers with a last name that starts with SM, or all of the customers with a balance greater than 1000.00, etc.

If you can resist the request to build such a thing, then by all means, don’t implement it. End-users have a knack for creating queries which will bring your database to its knees, no matter how clever you think you are with indexing or UI design.

If you must do it, you’ll want to keep a close eye on how it’s used from the outset, so you’re ready when the inevitable nightmare query is run. One way to do this is by using Hibernate’s comments feature.

Use Comments to Hunt them Down
My colleague Matt Brock implemented something like this where we work. Suppose you have a general method for creating an HQL query in a DAO class; the method could look like this:

protected Query createQuery(String hql) {
  return getSession().createQuery(hql).setComment(getUsername());
}
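A hypothetical DAO method built on that helper might look like the sketch below (the entity and property names are illustrative). Every query created this way carries the current username along as a SQL comment:

@SuppressWarnings("unchecked")
public List<Customer> findByCompanyNamePrefix(String prefix) {
  return createQuery("from Customer c where c.companyName like :prefix")
      .setParameter("prefix", prefix + "%")
      .list();
}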

Next, make sure Hibernate’s SQL logging and comment logging are enabled in your hibernate.cfg.xml (warning – this log can get pretty huge):

 
<property name="hibernate.show_sql">true</property>
<property name="hibernate.use_sql_comments">true</property>
.
.
.

Your log will now contain the users who are executing each query, like this:

Hibernate: /* johndoe */ select customer0.customer_id as col_0_0, customer0.cust_company_name from CUSTOMER customer0 where (customer0.customer_id=12136 or customer0.customer_id=16884 or customer0.customer_id=11150 or customer0.customer_id=155 or customer0.customer_id
=27265 or customer0.customer_id=697 or customer0.customer_id=4133 or customer0.customer_id=248 or customer0.customer_id=2550 or customer0.customer_id=8449)

One caveat: this may not work with stored procedures – SQL Server, in particular, will get nasty if you try to execute a stored procedure with Hibernate comments attached to it.

Happy Hibernating!

I’m a big fan of Nate Weiner’s Idea Shower. He recently wrote a really good blog post entitled “What Will the Web See When You Die?” In it, he wrote about the death of a snowboarding colleague, and how the traditional media publications cobbled a rather terse biography of the man by copying some of his profile information from a company website. His post was a good read. Go read it. Check out his other stuff, too, he’s brilliant.

Anyway, the story reminded me of one of my former colleagues who committed suicide a few years ago (in fact, I left a comment about it on Nate’s blog… most of this post is just an expansion of that comment!). He died several years after we had drifted apart (he had moved out to the west coast, and I moved back up to Maine), so I didn’t find out about it until months after it happened. After I found out, the first thing that I did was go to his web site. There he was, smiling at the camera in a blog entry about his and his girlfriend’s guacamole dip recipe. I found it rather eerie that his web site stayed up for so long after his death. I found myself re-visiting it days later; perhaps I was half-expecting new content to magically appear. Eventually the site just disappeared into the ether.

At the time I wondered if I should keep a copy of it and host it as a sort of tribute to him, but in the end I decided it was better to let the site go with him; keeping it would have been like the parents you read about who lose a child and can’t bring themselves to redecorate the child’s room. I doubt he would’ve wanted that.

In this age of caching pages and leaving parts of yourself scattered over the social web, it’s interesting to think that digital bits of your life will live on long after you die. Our lives are far shorter than the magnetic tapes and spinning disks that prop up the web. In fact, my friend’s pages are still in the internet wayback machine, so I didn’t even need to save them.

Photo Credit: ReefRaff