JPA/Hibernate - object proxying (behind the scenes)
"Having people who have difficulty sharing knowledge is perhaps a lot more dangerous than having people with no knowledge at all". Mário Júnior
There are many frameworks out there; Java is framework land. Using frameworks saves us some time, because we won't have to deal with a specific set of problems, simply because someone already dealt with them to shield us. But don't you wonder "how is the magic done?".
IMHO, there are decisions that the frameworks you are letting into your solution had to take, and you should be aware of them. So my advice is: trust the framework, but find some time to understand how the magic works, because, again, that will save you more time.
But more than saving you time, it gives you the possibility of making contributions to the framework project, and that's definitely something worth mentioning for me as an open-source contributor. "Let's stop acting as mere consumers in the producer-consumer chain": let's be producers and contribute to the future of our beloved frameworks (contributing to all of them would be impossible). If we can't contribute with code, then let's contribute with opinions and thoughts.
Sorry for that speech, I was possessed by some kind of revolutionary spirit. Let's go straight to the point: JPA and Hibernate.
JPA is a Java EE specification/standard dedicated to data persistence. I won't be explaining the whole "persistence providers, persistence unit" stuff. My focus is how Hibernate uses object proxying and the implications of how it works, so you had better already know what JPA and Hibernate are.
Intro
A few years ago I was convinced that the only way to proxy an object at runtime was through the interface it implements, meaning: if A implements B, then to create a proxy object of A, I would need an object C that implements B and wraps A. I was convinced that I couldn't create a proxy object at runtime if the object I wanted to proxy had no interface.
I kept learning new stuff, and you might be tempted to believe I know too much by now, but I'm just a learner who feels excited about sharing what he just learned. I later figured out that there was another way of proxying objects at runtime that I wasn't aware of. This new way is related to a simple and very important principle in Java:
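The interface-based approach described above can be sketched with the JDK's own `java.lang.reflect.Proxy`. This is only an illustration: `Greeter`, `PoliteGreeter` and `InterfaceProxyDemo` are names I made up, with `Greeter` playing the role of interface B and `PoliteGreeter` the role of class A.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface "B".
interface Greeter {
    String greet(String name);
}

// Hypothetical class "A", the object we want to proxy.
class PoliteGreeter implements Greeter {
    @Override
    public String greet(String name) {
        return "Hello, " + name;
    }
}

class InterfaceProxyDemo {
    // Records the name of every method called through a proxy.
    static final List<String> calls = new ArrayList<>();

    // Builds the object "C": it implements Greeter and wraps the target.
    static Greeter proxyFor(Greeter target) {
        InvocationHandler handler = (proxy, method, args) -> {
            calls.add(method.getName());        // extra behaviour added by the proxy
            return method.invoke(target, args); // then delegate to the real object
        };
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] {Greeter.class},
                handler);
    }

    public static void main(String[] args) {
        Greeter greeter = proxyFor(new PoliteGreeter());
        System.out.println(greeter.greet("Ana"));
        System.out.println(calls);
    }
}
```

Note that the proxy is a `Greeter` but not a `PoliteGreeter`, which is exactly the limitation this approach has when the class you want to proxy implements no interface.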
You can add new code to your Java application during runtime.
This is possible thanks to instrumentation and class loaders.
Instrumentation as an object proxying opportunity
When we start learning Java, the first thing we put into our minds is that the program starts from the main method. Well, this is only partially true, because you can have a method that is executed before main: premain. It lives in a Java agent (a class the JVM is told about through the -javaagent option), and it is where you are supposed to register the transformations you want applied to classes before main runs. You might add new methods to a class, add new attributes, remove the ones that exist, do whatever you want, even return a replacement class or define brand new classes.
The premain method is the way the JVM allows you to do instrumentation directly from Java code. You might be like "how is any of this useful?". You can use it to ease profiling, for example: you could transform the methods of some classes, adding code to log the time each method takes to run. Obviously there are tools already that will do this for you.
I brought up the instrumentation concept because it gives us a chance to proxy objects in a way that I personally define as inheritance based proxying.
Inheritance based proxying is based on the following statements:
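As a rough sketch of what such an agent could look like, the class below registers a `ClassFileTransformer` from `premain`. The names `LoadLoggingTransformer` and `LoadLoggingAgent` are hypothetical, and the transformer deliberately changes nothing: it only records which classes pass through it, which is the hook where a profiler would rewrite the bytecode.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical transformer: observes classes as they are loaded.
class LoadLoggingTransformer implements ClassFileTransformer {
    private final String prefix;
    private final List<String> seen = new CopyOnWriteArrayList<>();

    LoadLoggingTransformer(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public byte[] transform(ClassLoader loader, String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        // className uses slashes (e.g. "com/example/Deposit") and may be null.
        if (className != null && className.startsWith(prefix)) {
            seen.add(className);
        }
        // Returning null means "leave the bytecode as it is".
        return null;
    }

    List<String> seen() {
        return seen;
    }
}

// Hypothetical agent class; a real one would normally be public, in its own file.
class LoadLoggingAgent {
    // Called by the JVM before main when the agent is attached at startup.
    public static void premain(String agentArgs, Instrumentation inst) {
        String prefix = agentArgs == null ? "" : agentArgs;
        inst.addTransformer(new LoadLoggingTransformer(prefix));
    }
}
```

To try it, the agent has to be packaged in a jar whose manifest contains a `Premain-Class` entry naming the agent class, and the application started with something like `java -javaagent:agent.jar=com/example/ -jar app.jar` (the jar names and the prefix argument here are placeholders).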
- Object proxying is a process that only affects public methods
- Public methods are available for sub-classes
- A sub-class that overrides all the public methods of its super-class can be considered a proxy class.
It's all about object orientation, mostly about inheritance and it really pissed me off, because I couldn’t believe I was missing such a basic principle.
Let's clarify the inheritance based proxying:
If class B extends class A, then every object instance of B is an A. If class B overrides all public methods of A, then B is a proxy class.
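Written by hand, the idea looks roughly like the sketch below. `Counter` (class A) and `CounterProxy` (class B) are hypothetical names; a framework would generate the equivalent of `CounterProxy` at runtime instead of having it in the source code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class "A": it implements no interface at all.
class Counter {
    private int value;

    public int increment() {
        return ++value;
    }

    public int current() {
        return value;
    }
}

// Class "B": extends A and overrides every public method declared by A.
class CounterProxy extends Counter {
    private final Counter target;
    private final List<String> log = new ArrayList<>();

    CounterProxy(Counter target) {
        this.target = target;
    }

    @Override
    public int increment() {
        log.add("increment");      // behaviour added by the proxy
        return target.increment(); // delegation to the real object
    }

    @Override
    public int current() {
        log.add("current");
        return target.current();
    }

    List<String> log() {
        return log;
    }
}

class InheritanceProxyDemo {
    public static void main(String[] args) {
        CounterProxy proxy = new CounterProxy(new Counter());
        Counter counter = proxy; // callers only ever see the type A
        counter.increment();
        counter.increment();
        System.out.println(counter.current());
        System.out.println(proxy.log());
    }
}
```

One consequence of relying on inheritance: a final class cannot be extended and a final method cannot be overridden, so neither can be proxied this way.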
This inheritance based proxying thing is something really interesting but, how is any of this related to JPA and Hibernate? Well, Hibernate uses object proxying and I'm about to explain why and when:
Hibernate and object proxying
The first thing I want you to know is that Hibernate is not the only framework relying on object proxying; most of them do the same, for example CDI and Spring.
When we are working with related entities in JPA, we get to decide the strategy used to fetch them, and there are two possible strategies available: LAZY and EAGER. When we choose LAZY, Hibernate will only fetch the related entity instance(s) when we actually access them through the getter method. For example, for the following entity class:
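import javax.persistence.Entity;
import javax.persistence.FetchType;
import javax.persistence.ManyToOne;

@Entity
public class Deposit extends BaseEntity {
    private double amount;

    // LAZY: the related Account is only fetched when it is actually used
    @ManyToOne(fetch = FetchType.LAZY, optional = false)
    private Account account;

    public Account getAccount() {
        return account;
    }

    public void setAccount(Account a) {
        this.account = a;
    }
    // other getters and setters may come here
}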
Hibernate will create a proxy class at runtime, a subclass of Account that overrides its methods, and getAccount will hand you an instance of that proxy. The proxy goes to your database and fetches the record only once you invoke a method on it that needs the data. That's how LAZY fetching works. It's not magic, see.
There is another interesting optimization that Hibernate does when dealing with collections. Let's see another entity example:
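To make the mechanism concrete, here is a hand-written imitation of a lazy proxy. It is not the code Hibernate generates: `LazyAccount`, `LazyProxyDemo` and the simplified `Account` class are assumptions of mine, and a `Supplier` stands in for the SQL query.

```java
import java.util.function.Supplier;

// Hypothetical, annotation-free stand-in for the Account entity.
class Account {
    private final long id;
    private final String owner;

    Account(long id, String owner) {
        this.id = id;
        this.owner = owner;
    }

    public long getId() {
        return id;
    }

    public String getOwner() {
        return owner;
    }
}

// Imitation of a lazy proxy: a subclass that loads the real data on demand.
class LazyAccount extends Account {
    private final Supplier<Account> loader; // stands in for the database query
    private Account loaded;                 // null until the first real access

    LazyAccount(long id, Supplier<Account> loader) {
        super(id, null); // only the identifier is known up front
        this.loader = loader;
    }

    boolean isInitialized() {
        return loaded != null;
    }

    @Override
    public String getOwner() {
        if (loaded == null) {
            loaded = loader.get(); // the "database hit" happens here, once
        }
        return loaded.getOwner();
    }
}

class LazyProxyDemo {
    static int queries = 0;

    // Returns something typed as Account that has not been loaded yet.
    static Account lazyReference(long id) {
        return new LazyAccount(id, () -> {
            queries++;
            return new Account(id, "owner-" + id);
        });
    }

    public static void main(String[] args) {
        Account account = lazyReference(7L);
        System.out.println(account.getId() + " queries=" + queries);
        System.out.println(account.getOwner() + " queries=" + queries);
        System.out.println(account.getOwner() + " queries=" + queries);
    }
}
```

Asking for the identifier costs nothing because the proxy already knows it; the loader only runs the first time a method needs the actual data.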
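import java.util.List;
import javax.persistence.Entity;
import javax.persistence.FetchType;
import javax.persistence.OneToMany;

@Entity
public class Customer extends BaseEntity {
    // the inverse side of Account.customer, so this is a one-to-many mapping
    @OneToMany(fetch = FetchType.LAZY, mappedBy = "customer")
    private List<Account> accounts;

    public List<Account> getAccounts() {
        return this.accounts;
    }

    public void setAccounts(List<Account> accs) {
        this.accounts = accs;
    }
}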
In this case, to make LAZY fetching possible, Hibernate hands you its own implementation of the List interface through getAccounts. If you thought it was returning you an ArrayList, then I'm sorry to disappoint you: it's actually returning a PersistentList instance (or a PersistentBag, depending on the mapping). This PersistentList Hibernate is returning to you is a data proxy. You might think it contains object instances, but when Hibernate returns it, it has not been initialized yet. Hibernate will fetch the records the first time you iterate over it or otherwise interact with it, meaning: if you invoke the getAccounts method and then do nothing with the returned List object, no query will be fired, which is really interesting.
Talking about PersistentList is a bonus, because it's out of the scope of the current article, but I'm tempted to reveal things, so let me just say:
If the accounts variable were of type ArrayList instead of List, it wouldn't work, because Hibernate wouldn't proxy the ArrayList class. PersistentList implements the List interface; it doesn't extend or proxy ArrayList. Remember when I said "Knowing how the thing works will save you some time"? Now you might be like "Yeah", because if you know this, you won't spend hours trying to find out why using ArrayList doesn't work with JPA mapping.
Okay. I have given you, my dear reader, a basic overview of how and why Hibernate uses inheritance based proxying. Let's now see the implications of that.
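The behaviour of such a data proxy can be imitated in a few lines. The `LazyList` below is a much simplified sketch of the idea and not Hibernate's implementation; the class names and the `Supplier` used in place of a query are my own assumptions.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical imitation of a persistent collection: a List that loads itself.
class LazyList<E> extends AbstractList<E> {
    private final Supplier<List<E>> loader; // stands in for the database query
    private List<E> elements;               // null until the first access

    LazyList(Supplier<List<E>> loader) {
        this.loader = loader;
    }

    boolean isInitialized() {
        return elements != null;
    }

    private List<E> elements() {
        if (elements == null) {
            elements = new ArrayList<>(loader.get());
        }
        return elements;
    }

    @Override
    public E get(int index) {
        return elements().get(index);
    }

    @Override
    public int size() {
        return elements().size();
    }
}

class LazyListDemo {
    static int queries = 0;

    static List<String> accountsOf(String customer) {
        return new LazyList<>(() -> {
            queries++;
            return Arrays.asList(customer + "-savings", customer + "-current");
        });
    }

    public static void main(String[] args) {
        List<String> accounts = accountsOf("ana"); // nothing is loaded here
        System.out.println("queries=" + queries);
        for (String account : accounts) {          // the first access loads the data
            System.out.println(account);
        }
        System.out.println("queries=" + queries);
        // ArrayList<String> broken = (ArrayList<String>) accounts; // would throw ClassCastException
    }
}
```

Because `LazyList` is a `List` but not an `ArrayList`, code that declares the field or variable as `ArrayList` cannot receive it, which mirrors the mapping problem described above.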
Inheritance based proxying implications on Hibernate
I was sitting on a chair at home, coding as always, when a message popped up in my Telegram messenger. It was a friend sharing a problem with Hibernate: he was getting a StackOverflowError and couldn't figure out why. He kind of motivated me to write this article, and this specific section is related to that day.
"Objects returned by Hibernate must not be JSON serialized".
This is a statement by Mario Junior, a 24-year-old coder from Africa with no international credit, so why should you believe it? Because I can convince you: they weren't meant to be JSON serialized, and they will probably shoot you with a StackOverflowError when you try to do that. The chances of hitting that error while serializing entity instances to JSON are 99%, and here is why:
Imagine a database model with two entities: Order and OrderItem. The Order entity has a collection attribute named items (LAZY fetch), which represents the OrderItem instances related to it. The OrderItem entity, on the other hand, has an order attribute (LAZY fetch), which represents the Order instance to which it belongs.
If you run the following query:
Select item from OrderItem item
And then you attempt to serialize the resulting objects to JSON, a StackOverflowError will probably be thrown, because you will be entering an infinite recursion. Object-to-JSON serialization (with Jackson and Gson) walks the whole object graph: Jackson, by default, serializes whatever is exposed through public fields and getters, while Gson reads the fields directly, private ones included. Theeeeen:
An OrderItem object has a method named getOrder, which returns an Order proxied by Hibernate, so that it can go to the database and fetch the Order object. Once that object is fetched, it will immediately be serialized, which means the getItems method will be invoked and a List of OrderItem instances will be returned, which must also be serialized immediately, and each of those items has a getOrder of its own. The cycle never stops: it's an infinite recursion.
This is why I say: Do not JSON serialize JPA Entity beans.
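The cycle can be reproduced without Hibernate or a JSON library at all. In the sketch below everything is hypothetical: `Order` and `OrderItem` are plain classes without JPA annotations, `NaiveJson` is a toy serializer with no cycle detection (and no string escaping), and `OrderItemDto` shows one common way out, which is to copy only what you need into a flat object. In Java the error that actually gets thrown is `StackOverflowError`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the two entities, linked in both directions.
class Order {
    final long id;
    final List<OrderItem> items = new ArrayList<>();

    Order(long id) {
        this.id = id;
    }

    public long getId() {
        return id;
    }

    public List<OrderItem> getItems() {
        return items;
    }
}

class OrderItem {
    final String product;
    final Order order;

    OrderItem(String product, Order order) {
        this.product = product;
        this.order = order;
        order.items.add(this); // the item points to the order and vice versa
    }

    public String getProduct() {
        return product;
    }

    public Order getOrder() {
        return order;
    }
}

// Flat DTO: carries the order id instead of the Order object itself.
class OrderItemDto {
    final String product;
    final long orderId;

    OrderItemDto(OrderItem item) {
        this.product = item.getProduct();
        this.orderId = item.getOrder().getId();
    }

    String toJson() {
        return "{\"product\":\"" + product + "\",\"orderId\":" + orderId + "}";
    }
}

// Toy serializer that follows every getter and never checks for cycles.
class NaiveJson {
    static String write(OrderItem item) {
        return "{\"product\":\"" + item.getProduct()
                + "\",\"order\":" + write(item.getOrder()) + "}";
    }

    static String write(Order order) {
        StringBuilder sb = new StringBuilder("{\"id\":" + order.getId() + ",\"items\":[");
        for (OrderItem item : order.getItems()) {
            sb.append(write(item)); // walks straight back to the same Order
        }
        return sb.append("]}").toString();
    }
}

class SerializationCycleDemo {
    public static void main(String[] args) {
        OrderItem item = new OrderItem("book", new Order(1L));
        try {
            NaiveJson.write(item);
        } catch (StackOverflowError e) {
            System.out.println("entity graph: StackOverflowError");
        }
        System.out.println(new OrderItemDto(item).toJson());
    }
}
```

The DTO breaks the cycle simply because it holds no reference back into the entity graph.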
There is one thing you might have missed when I was talking about instrumentation. I know you remember the premain method, but you know there is no main method in Java EE, meaning there is also no premain. Then, how and when are classes transformed?
There is no premain in a Java Web/EE container
To be honest, Hibernate doesn't rely on premain. In Java, classes are defined by class loaders. Loading a class is not the same as defining it. Loading a class is simply reading the bytecode and parsing it for analysis, without publishing the class described by that bytecode, while defining a class means publishing the class represented by a bytecode sequence/stream so that it can be used.
Hibernate scans the bytecode of the classes (of course it uses a bytecode library such as Javassist for that; newer versions use Byte Buddy) and, based on the metadata returned by such a tool, it creates the proxy classes and then publishes/defines them so that they become available. There is a lot more going on while generating the new classes, but it's out of the scope of this article, which is already huuuuuuuuuuge.
Thank you for joining me. I hope you will leave a comment down there. Remember what I said up there about contribution? Yes, get used to that.
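The "define" step can be tried in isolation. The sketch below writes the class file of an empty interface byte by byte and asks a class loader to define it. This is only meant to show that a class can come into existence at runtime from a bytecode sequence: a real framework would rely on a bytecode library rather than hand-written bytes, and `DefiningLoader`, `RuntimeClassDemo` and `GeneratedMarker` are names I made up.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical loader that exposes the protected defineClass method.
class DefiningLoader extends ClassLoader {
    DefiningLoader(ClassLoader parent) {
        super(parent);
    }

    Class<?> define(String name, byte[] bytecode) {
        return defineClass(name, bytecode, 0, bytecode.length);
    }
}

class RuntimeClassDemo {
    // Builds the class file of an empty public interface with the given name.
    static byte[] emptyInterface(String name) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(0xCAFEBABE);           // magic number
            out.writeShort(0);                  // minor version
            out.writeShort(52);                 // major version (Java 8)
            out.writeShort(5);                  // constant pool count (entries 1..4)
            out.writeByte(7);                   // #1: Class, name at #2
            out.writeShort(2);
            out.writeByte(1);                   // #2: Utf8, this class name
            out.writeUTF(name);
            out.writeByte(7);                   // #3: Class, name at #4
            out.writeShort(4);
            out.writeByte(1);                   // #4: Utf8, super class name
            out.writeUTF("java/lang/Object");
            out.writeShort(0x0601);             // public | interface | abstract
            out.writeShort(1);                  // this_class = #1
            out.writeShort(3);                  // super_class = #3
            out.writeShort(0);                  // no interfaces
            out.writeShort(0);                  // no fields
            out.writeShort(0);                  // no methods
            out.writeShort(0);                  // no attributes
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        DefiningLoader loader = new DefiningLoader(RuntimeClassDemo.class.getClassLoader());
        Class<?> generated = loader.define("GeneratedMarker", emptyInterface("GeneratedMarker"));
        System.out.println(generated.getName() + " interface=" + generated.isInterface());
    }
}
```

`GeneratedMarker` never existed as source code or as a .class file on disk, yet after the `define` call it is a regular `Class` object owned by that loader.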
It's always a pleasure.
Comments
Very good article. It's important for us to have this view of how things work inside the frameworks. It cleared up a few more things for me. And understanding Hibernate's "proxying" can also help avoid some problems, such as the famous LazyInitializationException. In a future article you could talk about when to use (or how to correctly use) LAZY and EAGER, which also tend to cause some headaches. Keep that "knowledge sharing" mindset.
Great post. It shows how careful we need to be when building our data infrastructure layer. Understanding how things work really helps us implement patterns consciously and not just "because it's a pattern". Thanks for sharing.
Really awesome. This gave me a better understanding of object proxies, as they are used in other persistence frameworks for other languages as well. "Objects returned by Hibernate must not be JSON serialized". With Entity Framework, you can add an annotation to tell the JSON parser how "deep" it should go when loading related entities, to avoid this problem with circular dependencies. So there might be a way to do this with Java too.
There is. But I wouldn't recommend it. I do recommend DTOs. Jackson provides an annotation to skip a property on serialization, but I think people only win when they introduce DTOs, because they abstract the persistence infrastructure. It kind of forces you to rewrite some classes, but if you find yourself rewriting the same classes over and over, you are failing at your DTO creation.