Object-Oriented Programming (OOP)
Overview
- Object-Oriented Programming (OOP)
- Inheritance (OOP)
- Estimators
- Transformers
- Custom Estimators
- Pipeline
- Common Scikit-learn modules
What is Object-Oriented Programming?
A style of programming that emphasizes the use of objects to represent and process data in a program.
Basic Concepts
- Object (Instance)
- Class
- Property (Attribute, Field, Feature)
- Method
We will use an example to motivate the use of the OOP paradigm.
Example:
Suppose we are required to write a program to simulate the interactions between users on a Social Media platform.
For the basic requirements, we need to be able to:
- Represent each user's data (username, joined_date, friends, posts).
- Add a new friend
- Publish a post
- Like a post
Basic Solution
A basic solution would be to store each user's data as a dict, and use functions to manipulate the data.
{
    'username': 'john_doe',
    'joined_date': 'YYYY-MM-DD',
    'friends': [...],  # list of usernames
    'posts': [
        {
            'title': 'Post 1',
            'text': 'A new post',
            'likes': [...]  # list of usernames
        },
        ...  # other posts
    ],
}
Requirement 1: Represent user data
def create_user(username, joined_date):
    user = {
        'username': username,
        'joined_date': joined_date,
        'friends': [],  # no friends for new user
        'posts': [],  # no posts for new user
    }
    return user
>>> johndoe = create_user('johndoe', '2018-04-20') # creating a new user dictionary
>>> johndoe
{
    'friends': [],
    'joined_date': '2018-04-20',
    'posts': [],
    'username': 'johndoe'
}
>>> mikesmith = create_user('mikesmith', '2020-10-31') # creating a new user dictionary
>>> mikesmith
{
    'friends': [],
    'joined_date': '2020-10-31',
    'posts': [],
    'username': 'mikesmith'
}
Requirement 2: Add friend
def add_friend(user1, user2):
    username1 = user1['username']
    username2 = user2['username']
    user1['friends'].append(username2)
    user2['friends'].append(username1)
>>> add_friend(johndoe, mikesmith) # adding friends
>>> johndoe['friends']
['mikesmith']
>>> mikesmith['friends']
['johndoe']
Requirement 3: Publish post
def publish_post(user, post):
    user['posts'].append(post)
>>> post = {
...     'title': 'Post 1',
...     'text': 'A new post',
...     'likes': []
... }
>>> publish_post(johndoe, post) # publish a new post
>>> johndoe['posts']
[
    {
        'likes': [],
        'text': 'A new post',
        'title': 'Post 1'
    }
]
Requirement 4: Like post
def like_post(user, post):
    username = user['username']
    post['likes'].append(username)
>>> like_post(mikesmith, post) # liking a post
>>> johndoe['posts']
[
    {
        'likes': ['mikesmith'],
        'text': 'A new post',
        'title': 'Post 1'
    }
]
OOP Solution
The Basic Solution already solves the requirements for the program.
In fact, what we have done is conceptually in line with the OOP paradigm.
Recall:
Object-Oriented Programming is a style of programming that emphasizes the use of objects to represent and process data in a program.
What is an Object?
In simplest terms, an object is an instance of a data type or data structure.
Strings, integers, booleans, and lists are all objects.
Even the dict we've been using to represent the user data is an object.
There are two types of features that make objects powerful:
- Properties - variables that belong to an object. These variables are accessed through the object.
- Methods - functions that are bound to, and interact specifically with, the object.
Conceptually, the key-value pairs in the user dict are like an object's properties.
And the functions we've defined to manipulate the user dict are the methods associated with that specific type of data structure.
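To make this concrete, here is a small sketch that uses only built-in types; the sample values are made up for illustration. It shows that everyday values already carry both properties and methods.

```python
# Built-in values are objects: they carry properties (data) and methods (behaviour).
number = 3 + 4j             # a complex number object
print(number.real)          # property: the real part, 3.0
print(number.imag)          # property: the imaginary part, 4.0

greeting = 'hello'
print(greeting.upper())     # method bound to the str object, gives HELLO

friends = ['mikesmith']
friends.append('janedoe')   # method that modifies the list object in place
print(friends)              # ['mikesmith', 'janedoe']

user = {'username': 'johndoe'}
print(list(user.keys()))    # even a dict has methods, gives ['username']
```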
However, in order to create an object, we need to define its structure. This is accomplished with a class.
What is a Class?
A class is a blueprint of an object's structure.
Just as a house requires a blueprint that defines its structure, an object requires a class in order to be constructed.
Specifically, a class defines the Properties and Methods that its objects possess.
To define a class, we use a special keyword called class:
>>> class User:
...     pass
And we create an object (aka instance) of the class by calling it like a function. This is called instantiation.
>>> user = User()
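Note:
pass is a statement that does nothing. It serves as a placeholder wherever Python expects an indented block but there is no code to put in it, which avoids a common error such as the one below (newer Python versions may report an IndentationError instead).
SyntaxError: unexpected EOF while parsing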
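As a brief sketch of instantiation (the variable names first_user and second_user are illustrative only), each call to the class produces a separate object whose type is User:

```python
class User:
    pass  # empty class body; `pass` does nothing

first_user = User()
second_user = User()

print(type(first_user) is User)       # True
print(isinstance(second_user, User))  # True
print(first_user is second_user)      # False: two distinct objects
```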
Right now the user object does not have any properties defined.
We can assign and access properties on an object using the <object>.<property> syntax:
>>> user.username = 'johndoe'
>>> user.username
'johndoe'
Note:
Attempting to access a property that does not exist on an object will result in an error.
>>> user.name
AttributeError: 'User' object has no attribute 'name'
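When it is unclear whether a property has been set, the built-in hasattr and getattr functions offer a way to check before accessing it. A minimal sketch, redefining the empty User class so that it runs on its own (the 'unknown' default is an arbitrary choice):

```python
class User:
    pass

user = User()
user.username = 'johndoe'

print(hasattr(user, 'username'))         # True
print(hasattr(user, 'name'))             # False
print(getattr(user, 'name', 'unknown'))  # falls back to the default: unknown

try:
    user.name
except AttributeError as error:
    print(error)  # message explaining that 'name' is missing
```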
We can even combine the process of instantiating the object and initializing its properties into a single function, create_user_object.
This ensures that every User object created through create_user_object has the expected properties defined.
def create_user_object(username, joined_date):
    user = User()
    user.username = username
    user.joined_date = joined_date
    user.friends = []  # no friends for new user
    user.posts = []  # no post for new user
    return user
>>> johndoe = create_user_object('johndoe', '2018-04-20')
>>> johndoe.username, johndoe.joined_date
('johndoe', '2018-04-20')
>>> mikesmith = create_user_object('mikesmith', '2020-10-31')
>>> mikesmith.username, mikesmith.joined_date
('mikesmith', '2020-10-31')
Now that we've looked at properties, let's move on to methods.
Methods are functions that are bound to a particular class and are used by objects of that class.
We define a method on a class like this:
class MyClass:
    def my_method(self, ...):
        pass
And call it like this:
>>> my_object = MyClass()
>>> my_object.my_method(...)
It is similar to a function definition except for 2 notable differences:
- The definition exists within the indentation block of the class definition.
  - This means that most of the code relating to the class is bundled up in the class definition.
  - It's now clear the method is meant only for objects of that class.
- The first argument is always the current object calling the method. By convention, that first argument is named self.
  - We don't have to explicitly pass in the object for its methods to gain access to it.
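The role of self can be illustrated with a small sketch. The Counter class below is hypothetical and not part of the social media example; it only shows that calling a method on an object is equivalent to calling the function on the class and passing the object explicitly.

```python
class Counter:
    def increment(self, amount):
        # `self` is the object the method was called on
        self.total = self.total + amount
        return self.total

counter = Counter()
counter.total = 0                     # assign the property after instantiation

print(counter.increment(2))           # 2: Python passes `counter` as `self`
print(Counter.increment(counter, 3))  # 5: the same call with `self` passed explicitly
```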
class User:
    # Requirement 2: Add friend
    def add_friend(self, new_friend):
        username = self.username
        friend_username = new_friend.username
        self.friends.append(friend_username)
        new_friend.friends.append(username)

    # Requirement 3: Publish post
    def publish_post(self, post):
        self.posts.append(post)

    # Requirement 4: Like post
    def like_post(self, post):
        username = self.username
        post['likes'].append(username)  # append is a method on the `list` object

# Requirement 1: Represent user data
def create_user_object(username, joined_date):
    user = User()
    user.username = username
    user.joined_date = joined_date
    user.friends = []  # no friends for new user
    user.posts = []  # no post for new user
    return user
>>> # Requirement 1: Represent user data
>>> johndoe = create_user_object('johndoe', '2015-04-20')
>>> mikesmith = create_user_object('mikesmith', '2020-10-31')
>>>
>>> # Requirement 2: Add friend
>>> johndoe.add_friend(mikesmith) # instead of add_friend(johndoe, mikesmith)
>>> johndoe.friends
['mikesmith']
>>> mikesmith.friends
['johndoe']
>>>
>>> # Requirement 3: Publish post
>>> post = {
... 'title': 'Post 1',
... 'text': 'A new post',
... 'likes': []
... }
>>> johndoe.publish_post(post) # instead of publish_post(johndoe, post)
>>> johndoe.posts
[
    {
        'likes': [],
        'text': 'A new post',
        'title': 'Post 1'
    }
]
>>>
>>> # Requirement 4: Like post
>>> mikesmith.like_post(post) # instead of like_post(mikesmith, post)
>>> post
{
    'likes': ['mikesmith'],
    'text': 'A new post',
    'title': 'Post 1'
}
Now that we have moved most of the code into the User class, it is easy to see which methods an object can use.
Depending on the text editor or IDE you're using, you could even get auto-complete features.
It would be nice if the process for instantiating and initializing a User was also bundled together in the class definition with the rest of the code.
If only there was a way to automatically initialize the properties of an object at the same time as we instantiate it via MyClass().
Fortunately, Python provides a solution for this.
There is a set of special method definitions that Python watches out for in a class.
If present, these special methods enhance the functionality of the classes that define them and of their objects.
These methods are commonly referred to as dunder (double-underscore) methods, due to their naming convention (def __methodname__(self, ...)).
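As one illustration, defining __len__ lets the built-in len function work on our own objects. The Playlist class here is a made-up example, and its songs property is assigned after instantiation in the same way as in the earlier snippets:

```python
class Playlist:
    def __len__(self):
        # Python calls this method when len(playlist) is evaluated
        return len(self.songs)

playlist = Playlist()
playlist.songs = ['song_a', 'song_b', 'song_c']

print(len(playlist))  # 3
```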
One of those special methods is the __init__ method.
Whenever we construct an object (i.e., by calling MyClass(...)), Python automatically calls the __init__ method in the background after the object has been instantiated.
>>> # When we do this 👇
>>> my_object = MyClass(...)
>>> # Python does roughly this 👇 for us if __init__ is defined
>>> my_object = MyClass.__new__(MyClass)  # create the object without initializing it
>>> my_object.__init__(...)               # then initialize it
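A small sketch makes the automatic call visible. The Greeter class and its print statement are illustrative assumptions, not part of the User example:

```python
class Greeter:
    def __init__(self, name):
        print('__init__ was called')  # runs during Greeter(...)
        self.name = name

greeter = Greeter('johndoe')  # prints: __init__ was called
print(greeter.name)           # johndoe
```

Note that __init__ is never called by name here; constructing the object is enough to trigger it.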
We can now move the initialization process for User objects from create_user_object to the __init__ method.
class User:
    # Requirement 1: Represent user data
    def __init__(self, username, joined_date):
        # user = User() # we don't need this
        self.username = username
        self.joined_date = joined_date
        self.friends = []  # no friends for new user
        self.posts = []  # no post for new user
        # return user # we don't need this

    # Requirement 2: Add friend
    def add_friend(self, new_friend):
        username = self.username
        friend_username = new_friend.username
        self.friends.append(friend_username)
        new_friend.friends.append(username)

    # Requirement 3: Publish post
    def publish_post(self, post):
        self.posts.append(post)

    # Requirement 4: Like post
    def like_post(self, post):
        username = self.username
        post['likes'].append(username)  # Note: .append is a method for `list` objects
Now all of the code related to User is defined in the class definition.
Both the properties and methods are visible in the same location.
>>> johndoe = User('johndoe', '2018-04-20')
>>> mikesmith = User('mikesmith', '2020-10-31')
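Putting the pieces together, the sketch below repeats a condensed copy of the User class so that it runs on its own, then walks through all four requirements:

```python
class User:
    def __init__(self, username, joined_date):
        self.username = username
        self.joined_date = joined_date
        self.friends = []  # no friends for new user
        self.posts = []    # no posts for new user

    def add_friend(self, new_friend):
        self.friends.append(new_friend.username)
        new_friend.friends.append(self.username)

    def publish_post(self, post):
        self.posts.append(post)

    def like_post(self, post):
        post['likes'].append(self.username)


# Requirement 1: Represent user data
johndoe = User('johndoe', '2018-04-20')
mikesmith = User('mikesmith', '2020-10-31')

# Requirement 2: Add friend
johndoe.add_friend(mikesmith)

# Requirement 3: Publish post
post = {'title': 'Post 1', 'text': 'A new post', 'likes': []}
johndoe.publish_post(post)

# Requirement 4: Like post
mikesmith.like_post(post)

print(johndoe.friends)            # ['mikesmith']
print(mikesmith.friends)          # ['johndoe']
print(johndoe.posts[0]['likes'])  # ['mikesmith']
```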
Conclusion
Naming Convention
The built-in classes in Python are typically in lower-case because they are used frequently and recognized by most Python developers as classes.
However, when defining custom classes, it is standard to use PascalCase for their names so that readers of the code recognize them as classes at a glance.
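A quick check, sketched below, shows that lower-case built-ins such as list and dict are classes just like a PascalCase class we define ourselves (UserProfile is a hypothetical name used only for this illustration):

```python
class UserProfile:  # custom class: PascalCase
    pass

print(isinstance(list, type))         # True: `list` is a class
print(isinstance(dict, type))         # True: `dict` is a class
print(isinstance(UserProfile, type))  # True: so is our own class
print(type([]).__name__)              # list
```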
Documentation
When defining a class, it is advised to provide documentation using a multi-line string, and to document its methods in the same manner.
>>> class MyClass:
...     """Description and purpose of the class goes here.
...     Also describe the properties that belong to this class.
...     The rest of the class definition goes below
...     """
...
...     def my_method(self, x, y):
...         """Description about my_method preferably talking about
...         the purpose of the method and what its arguments are for
...         """
...         pass
Quick Tip:
If you call the built-in help function on a class or object, it outputs the documentation for that class.
>>> help(MyClass)
Help on class MyClass in module __main__:
class MyClass(builtins.object)
| Description and purpose of the class goes here.
| Also describe the properties that belong to this class.
| The rest of the class definition goes below
|
| Methods defined here:
|
| my_method(self, x, y)
| Description about my_method preferably talking about
| the purpose of the method and what its arguments are for
|
| ----------------------------------------------------------------------
| Data descriptors defined here:
|
| __dict__
| dictionary for instance variables (if defined)
|
| __weakref__
| list of weak references to the object (if defined)
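The same documentation is also available programmatically, through the __doc__ attribute or the standard library's inspect.getdoc function, which additionally cleans up indentation. A minimal sketch with a shortened version of MyClass (the docstring texts are abbreviated for illustration):

```python
import inspect

class MyClass:
    """Description and purpose of the class goes here."""

    def my_method(self, x, y):
        """Description of my_method and its arguments."""
        pass

print(MyClass.__doc__)                    # docstring of the class
print(inspect.getdoc(MyClass.my_method))  # cleaned-up docstring of the method
```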
Hopefully through these examples (Basic Solution and OOP Solution), you now realize how powerful the Object-Oriented Programming paradigm is.
This style of programming will come up frequently as you advance in your programming journey.
As an exercise, you can try implementing a Post class with what we've learned so far and integrate it with the current code.