Friday, October 10, 2014

Good programming is an optimization problem

When it comes to real-world software, I've come to realize that most design decisions which make perfect sense at one point in time seem stupid and useless after 'x' amount of time passes. The variable 'x' may differ from product to product, but because of a number of factors which affect businesses, software changes, and these changes introduce complexity until the original design is lost or altered dramatically. The application simply grows larger and larger, and not always in the direction that the original design assumed.

The reasons for this situation are many - business requirements change over time, new unexpected features may need to be added to the application which may introduce feature creep and code rot. Eventually, the application will be re-written (when the cost of a rewrite becomes less than the maintenance cost or for other reasons - the system is too complex and it's hard to adapt it to new, incoming requirements).

Good programmers optimize the time between rewrites. They structure and set things up so that the design is general enough to tolerate a large amount of feature creep, that is, it scales well with respect to features that might come in (yes, even those which cannot be predicted at the beginning). Eventually even their programs will be re-written. But it will be a really long time before that rewrite happens.

Optimizing time between rewrites is one of the hardest skills to acquire because it takes a great deal of practice and experience designing large and complex systems. Hire programmers who can do that and your company can add value much faster than it adds cost.

Saturday, June 14, 2014

Programs are programming languages

"When someone tells you, here is a new programming language, your first question shouldn't be,  'Well, gee how many characters does it take to invert a matrix?' , rather you should ask, 'If the language did not have matrices built-in, how would I go about implementing them?' "
    - Harold Abelson

This blog post is about one of the techniques we have for conquering complexity as programs evolve and grow: the technique of abstraction.

Let's start with the basics:
A variable is the simplest abstraction that a programmer can use.  What exactly is a variable? When you see a statement like

var rate_of_interest = 15;

what you have really done is, you have given a name to a numerical quantity, in order to avoid typing that quantity everywhere you need to use it. It is a simple abstraction which has a couple of uses. The first one is obviously the fact that your program is much easier to read and understand because of the good variable name you have used. It's much more enlightening to read:

var final_amount = (principal * rate_of_interest * time_period)/100;

than to read:

var final_amount = (principal * 15 * 4)/100;

The second thing which you have gained is the ability to modify your program in one place rather than in the 20 places where that value is used. That's the same advantage you gain when you follow the 'don't repeat yourself' principle when coding. But these advantages all emerge out of one thing we normally don't even think of, because it's so common: we have just built an abstraction. The kind of abstraction where we don't care about what something looks like, but only about its behaviour. Nobody cares about which memory location the variable gets stored in when they declare a variable. Nobody has to. The same principle applies to all primitives and the means of combination a programming language natively provides you.

The most interesting part about these simple primitives and the means of combination, is that they can be used to create complex primitives, which can be given their own means of combination. These complex primitives can themselves be made use of in a much larger system as simple primitives.

The previous paragraph is mostly theory, so let's see some code. I use JS to illustrate what I mean.
JavaScript provides a means of  combining things to create complex primitives in the form of JavaScript Objects. So, if you wanted to store a student's information, you would create a JavaScript object that might look something like this:

var student = {
    name: 'Tyrion',
    age: 17,
    email: 'abc@xyz',
    major: 'history'
};

(Actually, you would have a class with a constructor that created one of these things for you when you asked it to give you a student object, but for the sake of brevity, the above example is good enough.) You could have stored the same information in four different variables. But of course, you would never do that. The reason why you would never do that is that you know better. There is an abstraction here: you can meaningfully talk about 'student' objects and explain something to the other person (or the computer), which you would lose if you had variables that stored the above information like:

var student1name = 'Tyrion';
var student1age = 17;
var student1email = 'blah';
var student1major = 'history';

In a system which had to maintain the data of twenty students in a classroom, instead of having 20 'student objects', you would have 80 variables hanging around with no easy way to distinguish or use them.
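Here is a minimal sketch of the alternative. The `makeStudent` helper and the sample data are my own assumptions for illustration, not something the language gives you:

```javascript
// Hypothetical helper: bundles four related values into one 'student' object.
function makeStudent(name, age, email, major) {
    return { name: name, age: age, email: email, major: major };
}

// One array replaces what would otherwise be dozens of loose variables.
// The names and values below are made-up sample data.
var students = [
    makeStudent('Tyrion', 17, 'abc@xyz', 'history'),
    makeStudent('Arya', 16, 'def@xyz', 'fencing'),
    makeStudent('Sam', 18, 'ghi@xyz', 'history')
];

// Because each student is a single value, the whole group can be processed
// uniformly instead of handling student1name, student2name, ... one by one.
var names = students.map(function (s) { return s.name; });
console.log(names.join(', '));
```

The point is not the helper itself but that 'a student' is now one thing you can pass around, store in a list and talk about.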

The next level of abstraction is what Object Oriented Programming is all about: when you look at an object from outside the object, all you must see are its public functions. This abstraction is one level higher than the abstraction we have seen so far in my examples. (This is one of the concepts that I took a long time to grasp when I was first introduced to object oriented programming.)

To illustrate the principle in code, consider you have something like,

var student = {
    name: 'Tyrion',
    age: 17,
    email: 'abc@xyz',
    major: 'history'
};

and you had an array of such student objects called 'students'. Now suppose you wanted to know how many students were majoring in history. You would have a loop such as:

var history_majors = [];
for (var i = 0, j = students.length; i < j; i++) {
    if (students[i].major === 'history') {
        history_majors.push(students[i]);
    }
}

Would you do something like this? Of course you wouldn't! You are better than that. Well, why not?
Suppose a crazy programmer decides tomorrow to change the way student objects are represented: he decides to make all the fields in the "student" begin with an uppercase letter. (It's a contrived example, I know, but bear with me.) Then your code would no longer work, because its logic is intimately tied to the structure of the objects that it uses. In other words, it relies on the fact that a field called 'major' exists inside the student object.

This violates the principle of information hiding completely. To fix this, you would have the student objects expose methods that return the information you need. So, instead of saying,

students[i].major

you would say,

students[i].getMajor()

where getMajor is a method that the student object gives you to use (its public API) that returns the major of the object it is invoked on. This simple change is enormously important because methods like these allow you to separate the way you represent data from the way you use that data. This means you can go ahead and choose a specific way of representing data, and still change it whenever you like, and all your code would keep working without breaking. Also, note that the student object does not provide you with any means of combination. You cannot combine two student objects in any meaningful way. When you have things that can be combined meaningfully, like say, a vector v1 and a vector v2, then the interface that a vector exposes would look something like:

var v3 = v1.addVector(v2);

So, you don't have to know how the vector is added, only that v1 has a method that you can use, to add another vector to it, to obtain a new vector. Again, this is an example of information hiding and exposing a means of combination, and of using complex data objects as primitives in larger programs.
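Both ideas, hiding the representation behind getMajor and exposing addVector as a means of combination, can be sketched roughly like this. The `Student` and `Vector` constructors, the `getName`, `getX` and `getY` accessors and the sample data are assumptions I made up for illustration:

```javascript
// Hypothetical Student constructor: the fields live in a closure,
// so callers can only reach them through the public methods.
function Student(name, major) {
    // Internal representation. The uppercase field names are deliberate:
    // they can change freely without breaking any caller.
    var data = { Name: name, Major: major };
    this.getName = function () { return data.Name; };
    this.getMajor = function () { return data.Major; };
}

// Hypothetical 2D vector that exposes a means of combination.
function Vector(x, y) {
    this.getX = function () { return x; };
    this.getY = function () { return y; };
    // Returns a new Vector; callers never see how the sum is computed.
    this.addVector = function (other) {
        return new Vector(x + other.getX(), y + other.getY());
    };
}

var students = [
    new Student('Tyrion', 'history'),
    new Student('Arya', 'fencing')
];

// This code depends only on getMajor(), not on the field names.
var history_majors = students.filter(function (s) {
    return s.getMajor() === 'history';
});

var v3 = new Vector(1, 2).addVector(new Vector(3, 4));
console.log(history_majors.length, v3.getX(), v3.getY());
```

If the crazy programmer renames the internal fields again, only the two accessor bodies inside Student change; the filter stays exactly as it is.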

You can now see a very specific pattern that we have developed to create and use abstractions well.
We can apply closure to the means of arriving at abstractions themselves! In other words, this principle of separating the data representation from its usage can be extended to whole modules, where every module might consist of a hundred different things with controlled APIs, each of which is a compound entity with a public API that other entities in the module use. The module itself then has some public APIs that it provides you, and you make use of those in your programs to build more complex entities.
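As a rough sketch of what that looks like one level up, here is a tiny module. The `classroom` module and its `enroll` and `countMajoringIn` functions are hypothetical names of my own:

```javascript
// Hypothetical 'classroom' module built with an immediately invoked function.
// Everything declared inside is private unless it is returned.
var classroom = (function () {
    var students = []; // private: other code cannot touch this array directly

    // Private helper, not part of the module's public API.
    function makeStudent(name, major) {
        return {
            getName: function () { return name; },
            getMajor: function () { return major; }
        };
    }

    // The module's public API.
    return {
        enroll: function (name, major) {
            students.push(makeStudent(name, major));
        },
        countMajoringIn: function (major) {
            return students.filter(function (s) {
                return s.getMajor() === major;
            }).length;
        }
    };
}());

classroom.enroll('Tyrion', 'history');
classroom.enroll('Arya', 'fencing');
console.log(classroom.countMajoringIn('history'));
```

Inside the module, student objects are compound entities with their own small API; outside it, the rest of the program sees only enroll and countMajoringIn.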

Inheritance and Polymorphism in Object Oriented Programming build on this basic premise and impose special semantic rules for minimizing code duplication and for easy extensibility. A basic requirement when you use Inheritance is that a child object can be passed to any expression or method call, or anywhere really, where a parent object is expected. This is a direct consequence of having isolated the representation of the data from its usage.
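A minimal sketch of that substitutability, using plain prototypes. The `Person` and `Student` types, the `describe` method and the `greet` function are hypothetical names chosen for this example: code written against the parent's interface keeps working when it is handed a child.

```javascript
// Hypothetical parent type.
function Person(name) {
    this.name = name;
}
Person.prototype.describe = function () {
    return this.name;
};

// Hypothetical child type that inherits from Person.
function Student(name, major) {
    Person.call(this, name);
    this.major = major;
}
Student.prototype = Object.create(Person.prototype);
Student.prototype.constructor = Student;
// Polymorphism: the child overrides describe() without changing its callers.
Student.prototype.describe = function () {
    return this.name + ' (' + this.major + ')';
};

// Written against the Person interface only.
function greet(person) {
    return 'Hello, ' + person.describe();
}

console.log(greet(new Person('Jaime')));
console.log(greet(new Student('Tyrion', 'history')));
```

greet never asks which kind of object it received; it only relies on describe() being there.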

It's important to structure your programs using these principles of information hiding because your program will eventually end up being used as a module or as a primitive in some larger system.

So, in a way, all programs are programming languages for the programs they are going to be used in.

Sunday, April 20, 2014

Lessons learned after (nearly) a year of professional programming.

I turned 23 yesterday (I'm old. I know) and it's nearly a year since I started writing code professionally. During this time, I learned a lot of things about programming. Here are some of my observations.

1. Code is for people, binary executable is for the machine: This is a recurring theme which one reads about and something everyone knows, but it's so hard to remember this principle while coding and so easy to get carried away. In two weeks, no one, including the author of the code, remembers why the code was written the way it was. I am not talking about commenting code here. Although it's important to comment your code, a sign of a good programmer is his/her ability to express themselves as clearly as possible with their code alone. Comments should be minimal, and code itself should tell its story most of the time.

2. Respect your data types: Store strings as strings, numbers as numbers, ObjectIds as ObjectIds and dates as dates. Avoid *manual* conversion of types from one form to another as much as possible because it only leads to pain and unnecessary code in the form of conversion routines. A bug in conversion logic can be catastrophic.

3. Interface before Implementation: Write down what interfaces your entities will expose without worrying about how they are gonna do it. It's hard to keep these two facets separate when thinking about code, but it's an important skill which you have to develop. Think about the interface first, and write it down, because it's volatile and a minor change can impact anyone who is relying on that interface. Also, get a sign-off from all those who develop/rely on that interface.

4. Beautiful Code when you can, Beautifiable code when you can't: The ideal code contains zero 'if' statements, no loops and can be read like a novel. It doesn't use globals, doesn't maintain much state anywhere and exhibits all the characteristics of good code you have ever read about, anywhere. However, due to constraints of time, development that happens elsewhere which you rely on, and the libraries and frameworks you use, it's almost impossible to achieve the 'ideal code' which everyone desperately wants. When you are in complete control of all the environment variables, write beautiful code. When you are not, write code to finish the feature, but ensure that it can be 'beautified'. That is, your code must be easily refactorable even if it's not completely refactored.

5. Decide on suitable defaults beforehand: Variables will be undefined or uninitialized, database columns will hold nulls or will be non-existent, and parameters passed in may be null/undefined. In all such cases, it's important that your code function properly and not choke. Specifying default behaviour in all cases will almost always be impossible; however, you should identify the most common cases where there is a high likelihood of something like that happening and define application behaviour. It should be documented, preferably with comments. Timezones, internationalization and unicode support are some of the things you should worry about before writing a single line of your application code. Also, do not mess with usernames! Make no assumptions about them. All bets are off if you do!

6. Workflow is a habit: One of the most time-consuming aspects of writing code relates to the workflow which your team follows. It's not uncommon to get into all sorts of trouble with your development environment and the version control you use (use git. It's awesome.) and spend hours trying to resolve those problems. The time spent doing that cuts into the time which you could have spent thinking about how you could structure your code better (or playing 2048 while taking a break). Sometimes you have to get your environment back to a sane state, it happens. But if you make your workflow a habit you follow religiously, your "muscle memory" takes over and you find yourself producing code in such a way as to cause minimal disruption for yourself and for others. A few things I follow obsessively are: committing or stashing stuff frequently and rebasing the hell out of my commits before I push code.

7. Integration implications: It's important to clearly specify how a new feature fits into the overall application. What are its side-effects on already available features of the system? How easy is it to use? Who benefits most from it? What is the risk involved? Will the application exhibit unintended or seemingly inconsistent behaviour because of this addition? These are the questions you should ask and answer even before deciding the color of the button which invokes the new feature. The sooner you get these details straight, the better your resulting code is going to be. It's imperative to answer these questions before starting the design because any change which needs to be accommodated (with respect to the above posed questions) after the design is finished will be costly to make.

8. One way: Perl programmers will tell you "there is more than one way to do it". Pythonistas like myself laugh at them. In Python, "There should be one-- and preferably only one --obvious way to do it." This is a powerful belief which forces you to think about the best way to implement something. It eliminates unnecessary trains of thought that lead you away from the best solution. People disagree with me on this, but I believe it's important to assume there is a right and most straightforward way to accomplish something, and it alone must end up as code.

9. Functional all the way: Functional code is better. I rest my case.

10. Master the tools you use: Learn how to use your editor efficiently, build your "muscle memory", and master the devtools you use. Debugging tools, diff tools and commands that let you profile something, measure stuff, code quality tools and other aids are essentials when you write code. 
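To make point 5 (suitable defaults) a bit more concrete, here is a minimal sketch. The `displayName` helper and the 'Anonymous' fallback are assumptions I picked for illustration; what matters is that the default behaviour is decided up front and written down next to the code:

```javascript
// Hypothetical helper: returns the name to display for a user.
// Documented default: a missing user, or a missing or empty name,
// is shown as 'Anonymous'.
function displayName(user) {
    if (user === null || user === undefined) {
        return 'Anonymous';
    }
    var name = user.name;
    if (typeof name !== 'string' || name.length === 0) {
        return 'Anonymous';
    }
    // Returned untouched: no trimming, no case changes, no assumptions.
    return name;
}

console.log(displayName(undefined));
console.log(displayName({ name: 'Tyrion' }));
```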

The ideal state which we must aim for is to write good code subconsciously. The master programmer writes code without thinking about what he/she is writing. The fingers automatically hit the right keys and the code writes itself. Of course, what I am describing here is the highest possible state (something akin to a kung-fu master who just moves while fighting, without thinking). This can only happen with masters, but it's something that we must aim for.

These are some of the things I picked up after 11 months of writing production code, by watching how people (who are much better than I am) write code, and observing myself while I code. Coding is a craft and creating beautiful code is a pleasure in itself.

Saturday, March 8, 2014

What happened to User Experience

First of all, I am not a UI/UX designer and I have never been. So while what I am about to say may sound completely crazy, it's my take on how user interfaces have evolved over the years since I first played with a hand-held video game. (Good times, eh?) So, here goes... another blog on the inter-web about User Interfaces.

I can, with some difficulty, recall the times when we didn't have smartphones (or any phones for that matter) in our pockets. When the first cell-phones arrived, the buttons and the incredibly (by today's standards) small screen on those phones looked wondrous. I could not help but marvel at them. How incredible the invention was! (I was in high school at the time.) The user interface was not on anyone's mind back then as far as I can tell. People cared about the functionality: how much could they do with the device in their hands? They were willing to tolerate glitches, the device taking some time to accomplish a task while it displayed a loading spinner, and a lot more. People were patient back then.

And then, the iPhone happened. I am a fan of Steve Jobs but I admire the engineers who worked on the iPhone more. As an engineer, I know, I just know that implementing the ideas that Steve Jobs had was non-trivial and the technical constraints, not only the things related to software, but the manufacturing constraints of the day made their task so much harder. Anyway, the point I was trying to convey is the fact that, once the sleek iPhone hit the stores and people had enough money to afford it, something changed.

Users started expecting the same level of sophistication in other devices they use. The phone worked so well and so quietly and smoothly that it revolutionized the concept of 'responsive design' and 'user experience' forever.  What Apple did was to raise the bar so high in terms of usability that the rest of the technology companies had to follow them to the summit or die slowly.

I don't think Android would ever have had an interface like the one it currently has if it were not competing directly with the iPhone. This had a trickle-down effect on the rest of the software people use on a daily basis. The webapps which people use today are much, much better than they were a few years ago in terms of usability. We expect and take for granted killer graphics in webapps today. This is a direct consequence of users' expectations of usability.

The reason why this shift is important to keep in mind while designing and developing software is that you are supposed to make whatever new software you create beautiful. The key term to keep in mind here is the word 'beautiful'. There is no other word that can substitute for it. Your new software should look seductive, respond to the slightest touch, anticipate the user's actions beforehand and be so pleasurable to use that users can't help but get addicted to it. I sometimes come across buttons on web-apps which I just love to press! The software should extract loyalty by being everything the user desires and leaving no room for its competition.

Saying so is easy. But nailing down a user interface is incredibly hard. It is almost impossible to achieve the perfect user interface for the software you are building because when your target audience is the whole world (why settle for anything less? ), every individual user has his/her own preferences. Common patterns arise however and designers typically concentrate on pleasing the largest fraction of the populace for the largest possible time. Designing something beautiful is tricky and complex because what is beautiful to you may be crap for someone else. (I love command line interfaces and am a huge, huge fan of Linux but people around me who aren't geeks like me unanimously agree that command line interfaces suck. Since most of the world (in other words, the people you are serving) don't seem to like command line interface either, you have to find a better user interface to seduce them into using your software.)

The first thing I do when I write software nowadays is to ensure that its usability is no less than that of GitHub. I am a web developer, so I aim to make my apps as responsive, as beautiful, as elegant and as easy to use as what I consider to be the pinnacle of usability on the Internet: GitHub.
GitHub is the most usable website I have seen since the time I started using the Internet and paying attention to these things. Whoever the user interface designer of GitHub is, hats off to him/her! You rock! Others on my list are: Google, Gmail, Twitter,  Mozilla's website, WolframAlpha and perhaps a dozen other sites.

Beautiful User Interfaces are now the norm rather than the exception. Things change so fast in the world of software, with new stuff arriving every day trying to woo the user, that it is just impossible to have software that looks ugly or is hard to use and still succeed. Your software may be much more secure than anything that's out there, it may encompass more functionality than what the twenty apps on the user's phone combined can provide, but if your user interface is even a little hard to figure out, you've lost the chance to impress the jury. So, how does your app look when the browser is re-sized?

Friday, April 26, 2013

Time In Programming Languages

One of the most remarkable aspects of programming is the way in which time is modeled in our programs. I try to explain the way in which time is handled in functional and non-functional languages in this blog post.

What is a variable?

A variable in an imperative language is a named memory location. 

That is, in a statement like

int a = 37;      // in C, C++ and Java

'a' denotes a named memory location which can hold a value. That memory location gets updated each time we assign a new value to a. That is,

if you say,
                  a = a+1;

a now holds the value 38.

(Note that in languages like Python, variables are not typed but values are. So if you were to say a = [1, 2, 4] and later say a = 20, the principle is still valid. That is, 'a' is a name which can be updated and accessed. The difference is: a is not constrained to hold objects of a single type.)

What does this have to do with time?

As it turns out, everything. If you do not have assignment in your programming language, your programming language becomes purely functional in nature. That is, it loses the ability to model time.

Let me illustrate that with an example from Lisp where assignment is discouraged.

Computing the factorial of a number in Lisp:

(define (factorial n)
  (define (fact-iter fact n)            ;; inner function for making the process iterative
    (if (or (= n 0) (= n 1))
        fact                            ;; return fact when n becomes zero or one
        (fact-iter (* fact n) (- n 1)))) ;; recursive call
  (fact-iter 1 n))

In this factorial function, we have no variables to which anything is assigned.  No assignment is necessary. If you were to write the factorial function in C or its descendants, it would look something like this:

int factorial(int n)
{
    int fact = 1;
    for (int i = 1; i <= n; i++)
        fact *= i;
    return fact;
}

Notice that there is assignment in almost every statement. 

What is a variable in  a functional language?

In functional languages, a "variable" stands for  a value. That is, you must stop thinking about a variable as a location in memory somewhere that holds a value.  In fact, you must think of the variable as a "shorthand".

So instead of typing 3.14159 each time, you alias it by saying

(define Pi 3.14159)

in Lisp. There is no concept of updating the value of Pi to a different value "later on". Why? Because "later on" doesn't even make sense when you have no time.

It still isn't clear what I mean when I say time doesn't exist without assignment. So let me explain further: When you have assignment, you are updating a value in a memory location somewhere. So if you were to call a function with a variable as an argument, it will return a result. If you now update the variable's value and call the same function with the same variable passed in as argument, you now get a different result. That is, there are points in time when you get different results. And the reason you get different results is because you have assigned a different value. So time comes into play. There is a distinct concept of "result before assigning the new value" and "result after assigning the new value".

What happens when you get rid of assignment?

If you have no assignment, it means that variables truly are the values they alias. So, no matter how many times you call a function with a variable, you always get the same answer. If you want to get a different answer, you call the function with a different value (variable). Note that "before and after" don't exist in this scenario. 
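Here is a small sketch of the contrast, written in JavaScript rather than Lisp. The `interestWithState` and `interest` functions and the numbers are made up for illustration:

```javascript
// With assignment: the result depends on *when* the function is called.
var rate = 10;
function interestWithState(principal) {
    return principal * rate / 100; // reads a variable that can change over time
}
var before = interestWithState(200);
rate = 20;                          // assignment creates a "before" and an "after"
var after = interestWithState(200); // same argument, different answer

// Without assignment: the result depends only on the arguments.
function interest(principal, rateOfInterest) {
    return principal * rateOfInterest / 100;
}
var first = interest(200, 10);
var second = interest(200, 10);     // same arguments, same answer, whenever it is called

console.log(before, after, first, second); // prints 20 40 20 20
```

To get a different answer out of interest, you have to pass it a different value; there is nothing you can change "in the meantime".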

Why is time or the lack of it important?

Well, if you don't have time in your language, then the programs you write will be mathematical in nature. They will be akin to mathematical functions like f(x) = x^2 or f(x,y) = x+y which specify a distinct mapping. So they exist "timelessly", which means there are no synchronization errors. Also, the order of substitutions doesn't matter. What do I mean by that? Well, consider the sequence of statements:
1. i = 1;
2. j = 2;
3. j = i +1;
4. i  =  j + 1;

If I interchange statements 3 and 4, I get a different result: j ends up as 4 instead of 2 (i is 3 either way). So, the order of substitution matters. However, in the factorial procedure written in Lisp, in the line:

(fact-iter (* fact n) (- n 1))

the order in which I substitute the value of n doesn't matter, because n is the same throughout. If n is, say, 5, the expression becomes:

(fact-iter  (* fact 5) (- 5 1))
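The two situations above can be checked with a short sketch. The function names are hypothetical, and I use JavaScript here instead of C or Lisp:

```javascript
// Statements 1-4 in the original order.
function originalOrder() {
    var i = 1;
    var j = 2;
    j = i + 1;
    i = j + 1;
    return { i: i, j: j };
}

// The same statements with 3 and 4 interchanged.
function swappedOrder() {
    var i = 1;
    var j = 2;
    i = j + 1;
    j = i + 1;
    return { i: i, j: j };
}

// Without re-assignment, n is the same value wherever it appears, so the
// two arguments of the recursive call can be computed in either order.
function nextArgs(fact, n) {
    var first = fact * n;
    var second = n - 1;
    return [first, second];
}
function nextArgsReversed(fact, n) {
    var second = n - 1; // computed first this time
    var first = fact * n;
    return [first, second];
}

console.log(originalOrder(), swappedOrder());
console.log(nextArgs(1, 5), nextArgsReversed(1, 5));
```

Swapping the two assignments changes what the program computes; swapping the two substitutions of n changes nothing.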

On the other hand, if you have time in your programming language, then the order of statements matters and your programs will have 'state'. As it turns out, having state in a programming language leads to some horrible things like worrying about synchronization when you have multiple threads or when you are running  parallel algorithms. And you get a lot of bugs if you update the wrong variable first. 

The advantage of course is that, you have the power to represent and model the time [which we observe in the real world] in your computation. There are some situations where modeling time is of immense importance: how would you model the transactions of a  bank account in a purely functional language for example? The time of transactions is tied to having the correct balance.
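A minimal sketch of the bank account case, in the spirit of the account example from Structure and Interpretation of Computer Programs. The `makeAccount` function and the amounts are my own assumptions:

```javascript
// Hypothetical account: the balance is state that changes with each transaction.
function makeAccount(openingBalance) {
    var balance = openingBalance;
    return {
        deposit: function (amount) {
            balance = balance + amount;
            return balance;
        },
        withdraw: function (amount) {
            if (amount > balance) {
                return 'Insufficient funds';
            }
            balance = balance - amount;
            return balance;
        },
        getBalance: function () { return balance; }
    };
}

var account = makeAccount(100);
var early = account.withdraw(150); // attempted before the deposit
account.deposit(100);
var late = account.withdraw(150);  // the same request, made after the deposit
console.log(early, late);
```

The same call, withdraw(150), is refused at one point in time and succeeds at another. That is exactly the "before" and "after" that assignment brings in.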

There seem to be situations which purely functional languages cannot handle. So even though Lisp is considered a functional language, it provides an assignment operator (called set! in Scheme). And Lispers are careful not to use it too often. The challenge is to retain as much functional nature as possible while admitting state into our programs.

Why did you write this post?

Nobody explained to me the consequences of having an assignment statement and how it relates to time. In fact, I had not even thought about it. Luckily, I read Structure and Interpretation of Computer Programs and watched the Abelson and Sussman videos which explained what the consequences of having an assignment statement in a language were. I hope readers of this post see assignment in a new light. And I am fascinated at how a simple thing like an assignment statement in a programming language can raise questions about a deep concept we call time. Perhaps things would be different if we all existed in some timeless eternal universe...