There are two approaches to batch insertion in Hibernate. Each one is explained below:
- The first approach uses the Session class. Let's walk through it with an example. Suppose a library application uses Hibernate as its ORM layer, and you want to insert 1 million books into the library. For simplicity, the domain model contains two classes, Book and Publisher. Book holds a reference to Publisher, and I have enabled cascading on insert and update, so if I save a book whose publisher is not yet in the system, the publisher is saved along with the book. The code to insert the books is given below:
import java.util.Date;
import org.hibernate.Session;

Session session = HibernateUtil.getSessionFactory().openSession();
session.beginTransaction();
// 1 million books, as described above
for (int index = 0; index < 1000000; index++) {
    Book book = new Book();
    book.setAuthor("amer");
    book.setIsbn("34343");
    book.setName("Hibernate " + index);
    Publisher pub = new Publisher();
    pub.setName("Publisher " + index);
    book.setPublisher(pub);
    book.setPublishDate(new Date());
    session.save(book); // the publisher is saved through the cascade
}
session.getTransaction().commit();
session.close();
The code above puts a heavy burden on memory, because whenever an object is saved, Hibernate also keeps it in the session cache (also called the first-level cache), and with this many objects you will probably run into an OutOfMemoryError. To avoid this you have to clear the session, but the question is when: after each insertion or at some interval? Clearing the session after every insertion degrades your application's performance dramatically, because before calling clear() you have to call flush(), which synchronizes the in-memory state of the objects with the database. This is where batch insertion comes in. First, add the "hibernate.jdbc.batch_size" property to your hibernate.cfg.xml file with a value of 50. Then, inside the loop, flush and clear the session at the same interval:
session.save(book);
if (index % 50 == 0) { // 50 matches hibernate.jdbc.batch_size
    session.flush();
    session.clear();
}
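As a hedged sketch of the flush-and-clear pattern above, the helper below is plain Java and does not depend on Hibernate. The names BatchInsertSketch, batchSettings, and insertInBatches are hypothetical and are not part of Hibernate; the save and flushAndClear callbacks stand in for session.save(...) and for session.flush() followed by session.clear(). It counts saved items instead of testing the loop index, so the first flush happens only after a full batch, and a final flush covers any leftover items. The Properties object shows one way the batch size could be supplied in code, assuming it is later handed to Hibernate's configuration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.function.Consumer;

/** Sketch of flushing in fixed-size batches; hypothetical helper, not a Hibernate class. */
final class BatchInsertSketch {

    /** Builds settings carrying the batch size property named in the text. */
    static Properties batchSettings(int batchSize) {
        Properties props = new Properties();
        props.setProperty("hibernate.jdbc.batch_size", Integer.toString(batchSize));
        return props;
    }

    /**
     * Calls save for every item and flushAndClear after each full batch,
     * plus once more at the end if a partial batch is left over.
     * Returns how many times flushAndClear was invoked.
     */
    static <T> int insertInBatches(Iterable<T> items, int batchSize,
                                   Consumer<T> save, Runnable flushAndClear) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int pending = 0; // items saved since the last flush
        int flushes = 0;
        for (T item : items) {
            save.accept(item);
            pending++;
            if (pending == batchSize) {
                flushAndClear.run();
                flushes++;
                pending = 0;
            }
        }
        if (pending > 0) { // leftover partial batch
            flushAndClear.run();
            flushes++;
        }
        return flushes;
    }

    public static void main(String[] args) {
        // Stand-in for the session cache: a plain list that grows on every save.
        List<String> cache = new ArrayList<>();
        List<String> books = new ArrayList<>();
        for (int index = 0; index < 120; index++) {
            books.add("Hibernate " + index);
        }
        int flushes = insertInBatches(books, 50, cache::add, cache::clear);
        System.out.println("flushes = " + flushes + ", left in cache = " + cache.size());
        System.out.println(batchSettings(50));
    }
}
```

With a real Session, the call might look like insertInBatches(books, 50, session::save, () -> { session.flush(); session.clear(); }), where books is a collection of Book objects prepared beforehand. Because the cache is cleared after every full batch, it never holds more than batchSize objects at a time.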
- The second way of doing batch insertion is through the StatelessSession class. StatelessSession differs from Session in that it does not cache objects, does not call interceptors, does not maintain a persistence context, does not cascade to associated objects, does not handle collections, and passes each object directly to a JDBC insert statement. In other words, it is much closer to plain JDBC. With StatelessSession you have to save associated objects separately; in the example above, you have to insert the publisher and the book separately. The same code with StatelessSession becomes:
import java.util.Date;
import org.hibernate.StatelessSession;

StatelessSession session = HibernateUtil.getSessionFactory().openStatelessSession();
session.beginTransaction();
// 1 million books, as in the first example
for (int index = 0; index < 1000000; index++) {
    Book book = new Book();
    book.setAuthor("amer");
    book.setIsbn("34343");
    book.setName("Hibernate " + index);
    Publisher pub = new Publisher();
    pub.setName("Publisher " + index);
    book.setPublisher(pub);
    book.setPublishDate(new Date());
    session.insert(pub);  // no cascading, so the publisher is inserted first
    session.insert(book);
}
session.getTransaction().commit();
session.close();
My observations on StatelessSession:
- StatelessSession does not offer any performance gain over Session when you also have to save associated instances.
- It does not call interceptors or fire events, which may complicate the application design if you rely heavily on interceptors or events for logging, data-level security, and so on.
- Even for a plain object, StatelessSession did not give me much of a performance gain over Session. It only helped when the object had one or two attributes; for example, inserting only publishers through StatelessSession saved me 20 to 30 seconds.
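Since these numbers come from one setup, it may be worth timing both approaches on your own data. The following is a minimal sketch of such a comparison, assuming each insert loop is wrapped in a Runnable. InsertTimingSketch, timeMillis, and busyWork are hypothetical names, and busyWork is only a placeholder workload so that the example runs without a database.

```java
/** Sketch of a wall-clock comparison; hypothetical helper, not a Hibernate class. */
final class InsertTimingSketch {

    /** Runs the task once and returns the elapsed wall-clock time in milliseconds. */
    static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000L;
    }

    /** Placeholder workload standing in for an insert loop. */
    static long busyWork(int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += i % 7;
        }
        return sum;
    }

    public static void main(String[] args) {
        // In a real comparison, each Runnable would wrap one of the insert
        // loops shown earlier (Session versus StatelessSession).
        Runnable sessionInsert = () -> busyWork(2_000_000);
        Runnable statelessInsert = () -> busyWork(1_000_000);
        System.out.println("Session:          " + timeMillis(sessionInsert) + " ms");
        System.out.println("StatelessSession: " + timeMillis(statelessInsert) + " ms");
    }
}
```

A single run like this is only a rough indication; repeating each measurement several times against the same database gives a fairer comparison.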
9 comments:
Awesome piece of comparison...
Thanks for sharing the knowledge which is encouraging for beginners like me....
Thanks Amer. Seriously informative piece of information. Very, very helpful. Thank you so much.
Thanks for this short and very intuitive knowledge sharing on Hibernate. I have 4 lakh records which I have to insert. I used a for loop and it was expensive. This would help me.
I have to insert 4 lakh records, 4 times in my application flow.
I am thinking of going with StatelessSession as I have only two columns in my table and a large number of records.
Imran
Thanks!
I have created a complete Maven project which has all the needed configs to do batch inserts with MySQL, and some other tips as well. Please see the details at http://sensiblerationalization.blogspot.com/2011/03/quick-tip-on-hibernate-batch-operation.html
Thanks... It was very helpful
Rama
Has anybody used a rollback on the session when a batch insert fails on one of the inserts with a duplicate constraint violation? I want to roll back if there is even 1 error during the batch insert.
Is it possible to do the same using JPA?
Great post!
Batch inserts in Hibernate usually are a source of performance problems for beginners. I am writing a similar series where I optimize batch inserts with Hibernate and document how much we gain, check it out for more info:
http://korhner.github.io/hibernate/hibernate-performance-traps-part-2/