Object Oriented Programming doesn't need Encapsulation
Note: I use the term ‘Encapsulation’ as a language mechanism for restricting access to some of the object’s components. Others might call this ‘information hiding’.
Traditional Object Oriented Programming
In every object orientend programming course or book that I know, you get introduced to a concept called encapsulation. Encapsulation is a technique that allows you to protect implementation details while exposing only the interface. The advantage is that the rest of the code can only use the interface, and therefore is not dependent on the hidden implementation.
The thought behind the whole idea of encapsulation is that when implementing a class, you protect it against misuse. You make sure that the user can’t break it. You have total control over your class, and the user can only use that what you allow him to use. Another advantage is that providing a stable interface will protect implementation details that are likely to change, and therefore limiting interdependencies between software components. That is, it forces the users to do so.
Most Object Oriented programmers totally agree with all of the above, and I used to too. But that was until I met python.
OOP without Encapsulation?
Python supports Object Oriented Programming, but it doesn’t support encapsulation. Now how is that possible? Various OO programmers asked the same question in the python user group, and the end conclusion is pretty simple: you don’t need encapsulation for OO. In python everything is public, you just put and underscore before members or methods that are not part of the interface. If the user is going to use them, it’s at their own risk. Python programmers refer to it as “encapsulation by convention rather than enforcement”. But essentially it’s not encapsulation or data hiding at all. If you ask yourself why this would be an improvement, you probably don’t know that in some rare cases, it is necessary to access the implementation. Let me explain below.
Using the ‘computer’ metaphor
I have this computer on my desk, and it’s a nice piece of engineering. It has a public interface which is clean and simple; On the front it has 2 buttons “Power” and “Reset”, and a few leds to show if it’s on or if the hard disk is accessed. On the back it has connectors where I can plug in the power, network, keyboard, mouse, screen, etc. .
When it’s broken or I want to upgrade, I can just buy a new one and it will provide me with the same interface.
What my computer doesn’t use, is proper encapsulation. Remember that encapsulation is restricting access to some of the underlying components. But with my computer, I can pop open the hood and do anything I want. The hardware manufacturer didn’t restrict my access to the underlying components, and I’m glad he didn’t. The thing is, when I open the computer case, I know 2 things:
- I might break stuff
- Components inside the computer will probably change when I buy a new one.
This is exactly what encapsulation is trying to prevent. But you know what, I want to have a choice. I want to be able to put a new hard drive or silent fan in there, and yes, I know the risks, and know a new computer probably won’t support them anymore. But as a user, it’s my responsibility to make the correct decisions for solving my problems.
So what would happen when the hardware manufacturer did use encapsulation. Well, then he would make it extremely difficult for you to open the case, and anytime you want something changed you need to go back to the shop.
Encapsulation restricts access to implementation details. While I’m not saying you should access implementation details, sometimes it might be handy to do so. With hardware I’m glad they don’t restrict me opening up the hood, so why would I like it in software?
Abstraction instead of Encapsulation
One definition of abstraction states: ‘the act of considering something as a general quality or characteristic, apart from concrete realities, specific objects, or actual instances.’. In programming terms this means that you provide an interface on top of an implementation, but the user only needs to know the interface (abstraction), and not care about the implementation details.
The difference with encapsulation is that when the user wants to, he can still access the underlying details.
Most hardware uses abstraction instead of encapsulation. They provide an interface that’s not likely to change, but still allow “power users” to pop open the hood and access the implementation.
When using abstraction instead of encapsulation, the user can clearly see the interface, and use it in the same manner as with encapsulation. Users are encouraged to use the public interface, but are not restricted to it. When only using the interface, abstraction has the same benefits of encapsulation. But when needed he can also access the implementation behind it. And because of the clear distinction between interface and implementation, he is aware of the risks involved (ie breaking things and future incompatibility).
A real world programming example
Let me show you how this might work in the real world of programming. Suppose I’m writing an application that uses a multiplatform GUI library. Multiplatform libraries are great because they make porting really easy. Their public interface is the same across platforms, and so the method calls don’t need to be adapted when switching platforms.
Now suppose the Windows users of our application are requesting a certain feature. That features is not supported by the GUI library, but it is available in the underlying Windows implementation of that library (because MFC supports it for example).
At that point you must take a decision: If you respect encapsulation, you are limited to the following things:
- Implement it yourself on top of the GUI library, but this could mean rewriting a whole GUI control yourself (duplicating both the code of the GUI control, and the code of the MFC feature you want).
- Contact the GUI library vendor to make it available in their public interface, but this also means they have to implement it themselves on all non-Windows platforms. In other words, you won’t have it available soon.
- Switch your windows code over to a Windows specific GUI library. This will cost you some serious time, money and headaces.
What if you would drop encapsulation and use abstraction? Well, in that case you still have the options above, but also an extra one. You’re able to access implemenation details, but then you will have to consider the risks:
- Accessing it is risky for breaking other parts of the GUI Library
- Updates of that library might change the way they use MFC, so you will have to double check it every time you upgrade
- Probably no official support for that from the Library Vendor
The thing is, by using abstraction instead of encapsulation, you do have a choice whether the benefits outweigh the risks. And in this case, that’s probably correct. But remember, I’m not encouraging anyone to use implementation details, but in specific cases, it might be the best solution.
How I do it
Because Python doesn’t support encapsulation, I don’t have to do anything specific to use abstraction instead of encapsulation. I follow the conventions of putting underscores in front of methods and members, to make the distinction clear between interface and implementation. My users should only use my public interfaces, but if they really need to, they can ‘pop open the hood’ pretty easily.
In more traditional object oriented languages, I use ‘public’ for my public interface, and ‘protected’ for my implementation details. You probably know the following expression from the ‘Gang of Four': “Because inheritance exposes a subclass to details of its parent’s implementation, it’s often said that ‘inheritance breaks encapsulation'”. That last part is exaclty what I need for my abstraction :). My users use the public interface, but if they really want to, they can access all implementation details by deriving their own version from it.
This also speeds up my development, because I don’t have to break my head over whether I should use private, package-private, protected or public. I use public for my interface, and protected for implementation details. I love to Keep It Stupid Simple.
Encapsulation is not Security
Sometimes it is necessary to protect or hide some data from the user. But in this case we are talking about security, and the worst way to implement security is through Object Oriented encapsulation.
Depending on implementation details is a bad idea. Therefore you should only use the public interface of a class. But for specific or unforeseen needs, it might be useful to be able to access implementation details. It’s the responsibility of the class to be as useful as possible, and the responsibility of the user to use it in a professional manner. Don’t treat your users like idiots, but treat them like professionals who make use of your class the best way possible, to solve their specific problems. If they to need access the implementation details, they probably have a good reason to do so, because else they wouldn’t. So don’t treat them like idiots by using encapsulation.