Image of Code optimisation made simple

ADVERTISEMENT

Table of Contents

Introduction

For the everyday coder, this post is for you.

Most of us finish college and end up working jobs in Software development, writing lines upon lines of code. We usually don't get to choose what we build as that decision is up to the customer. After all, they are paying us to code. We are given some sort of problem description from our Manager and expected to go away and implement a solution for it, a-ok no problem. That might sound boring, but don't worry because it isn't. There is opportunity to become incredibly creative with the build process, through improving your everyday code.

Take a brute force approach first

To improve your everyday code, we must first start with actually having some code to improve. A brute force approach is always best to start with. What does this mean? Just solve the problem without thinking how to solve it optimally. This approach can be taken with just about anything, for example, say you were a front-end developer building an aesthetic user-interface for a client. First, build all the required panels, widgets and all other items onto the UI. Afterwards, you can start optimising the UI by re-positioning components for a better user-experience or performing an extensive analysis through user-testing and survey research. The point is though We have something to work with before making it perfect.

Optimising everyday code

What is everyday code? The code that is boring to write, but has to be done. Let's say, for example, we have been given a task from our Manager to search through a Student directory of Students and find a Student ID number, then print out their name to the screen. Below is an example of this using Java. If you're not so familiar with Java, check out my earlier post here for a brief introduction.

public static void main(String[] args)  {

    // Initialised when program is started
    int targetStudentID = Integer.parseInt(args[0]);

    ArrayList<Student> studentList = new ArrayList<Student>();

    // Static method that fills the list with Student objects
    populateList(studentList);

    for ( int i = 0; i < studentList.size(); i++ ) {
        if ( studentList.get(i).id == targetStudentID ) {
            System.out.println(studentList.get(i).getName());
        }
    }

}

Optimising is about reducing the compute time wherever possible, thus speeding up your application. So what are some ways to do this? Well let's take a deeper look into what we have above, going line by line. For simplicity, let's re-number each candidate line for optimization.

  1. targetStudentID - It receives the program argument as a Student ID number, converts the value to an Integer type which represents the Student ID. For example, "111222333".

  2. studentList - A List implementation in Java. It can store Student object types.

  3. populateList(studentList) - This method populates the list of Students. I left this out intentionally, but we can assume that there is some type of call to a database to extract the list of students.

  4. for(int i = 0; i < studentList.size(); i++) - Iterator that goes over all Student objects in the list.

  5. if(studentList.get(i).id == targetStudentID) - Comparison check on the Student ID.

  6. System.out.println(studentList.get(i).getName()) - Printing the name of the Student with the targetted ID.

Where can we make some improvements? Well, there isn't too many improvements we can make to '1' that will create a noticeable difference. We can look for a better implementation of parseInt(), or re-write the inbuilt method but it probably already performs just fine. Next up '2', studentList is an ArrayList which is the Java implementation of the List data structure. Optimisation opportunity - we can switch out the List data structure for a Hash Table data structure. This will solve our problem in a more optimal fashion. Why? Because instead of having to search through the List of Student objects, taking at most O(N) time where N is the number of Students. We can reduce this to O(1) time, for accessing the targeted Student. You can read more about the benefits of Hash Tables here.

What about '3'? There is an assumed database call to retrieve all the Student information from a database. Since we are using a database we inherit all the issues that come with accessing data from a database. Such as, I/O time, network performance and concurrency. What if we put in place a memcached server to cache the Student objects? We could also scale up the server that is hosting the application, eg. a single 64-bit server with 48GB, and will put up to 80,000,000 items in it. That should be enough right? Remember we need think longevity when building applications.

A memcached server sure will increase the performance of retrieving Student information instead of reading it directly from the database. We can just request for the Student by Student ID via the memcached server, and print the returned Student name. That should improve the performance by not having to drag the whole Student directory in, much faster!

Let's look at '4', '5', '6' collectively since they are the crux of the for loop. We are using a List, so we must search through the entire List and access each object to find out if the ID is the same as the requested Student ID. That is extremely slow. What if the Student we were looking for is at the very end of the list. The best we can do here, is add a break; statement in so that once we find the Student we were looking for, we stop our search. In turn, making the program finish earlier.

An improvement

public static void main(String[] args) {

    // Initialized when program is started
    int targetStudentID = Integer.parseInt(args[0]);

    HashMap<Integer, Student> studentMap = new HashMap<Integer, Student>();

    // Static method that fills the map with Student objects
    populateMap(studentMap);

    // Looks up the map in O(1) time to get the Student object
    System.out.println(studentMap.get(targetStudentID).getName());

}

We simply swapped out the List data structure for a Hash Table implementation. This optimises our first solution and reduces the running time of application significantly. Through using the Student number as a key for looking up the associated Student object, we can sure see a reduction in "search" time. This implementation also performs well at scale. For example, 5 years have passed and now the school has fucking way more students. Looking at our brute force implementation, it would take even longer to search through that List and find that Student. Stick with solutions that scale well.

Conclusion

Utilising this optimization process will help you build better, faster applications that scale well with growth. Some things can't be optimised, so start with a brute force approach first to get something out quick. Then, dissect your code like a surgeon until every line is the best it possibly can be. These techniques will help you become a better programmer and will impress your customers through the quality of the products that you can build.

Final Notes