Conceptualising equality and inequality in code
In some programming languages (PHP and many SQL dialects, for example) we can choose to use either != or <> to test inequality, and both are valid syntax.
Assuming both are valid for the language, they ‘mean’ the same thing in programming terms and will have the same effect in the code. Conceptually, though, I got to thinking that they are different, and that the one we choose signifies something about the way we conceive of equality and comparison in the abstract.
<> is a less than (<) and a greater than (>) symbol paired up, which suggests something along the lines of ‘less than or greater than this, but not actually equal to it’, so “inequality” in this case is similar to “less than or greater than”.
!= is a NOT operator (!) and an equals (=) symbol paired up, which suggests something more like NOT(equal), i.e. the opposite (true/false as the case may be) of the ‘equals’ comparison.
Do we think of inequality as the opposite of equality (the second case) or as one of two conditions, less than or greater than (the first case)? This, I think, is the distinction between the two syntaxes (syntaces? synti? syntaxen? I’m not sure I’ve ever needed to write it before, so I don’t know what the correct plural is, and I suspect a Google search won’t be any more definitive!)
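One place where the two readings genuinely come apart is with values that have no ordering relative to each other. The following is only a minimal Python sketch of that idea, using the floating-point NaN value as the example. The helper names not_equal and less_or_greater are made up for illustration, and Python 3 itself only accepts != as an operator.

```python
import math

def not_equal(a, b):
    # The "!=" reading: inequality is simply the negation of equality.
    return not (a == b)

def less_or_greater(a, b):
    # The "<>" reading: inequality means "less than, or greater than".
    return a < b or a > b

# For ordinary numbers the two readings agree.
print(not_equal(1, 2), less_or_greater(1, 2))  # True True
print(not_equal(3, 3), less_or_greater(3, 3))  # False False

# NaN is neither equal to, less than, nor greater than anything
# (itself included), so here the two readings give different answers.
nan = math.nan
print(not_equal(nan, nan), less_or_greater(nan, nan))  # True False
```

For everyday integers and strings the distinction never shows up, which is probably why the two spellings can be treated as interchangeable in practice.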
The way we conceptualise comparisons like this will probably also shape how actual parts of the code are written. For instance, if we want to iterate through a bunch of items and stop when a particular item is found, we could write something like the following:
while (!found)
or
while (found == false)
or
while (notFound)
all of which test the same thing but are different ways of thinking about the same concept (and, in a more complex case, more or less prone to error!)
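To make the comparison concrete, here are the three styles sketched as complete search loops in Python, where not plays the role of !. The function names and the bounds check on i are assumptions added so that each loop actually runs and terminates; they are not taken from any particular codebase.

```python
def find_with_not(items, target):
    # Style 1: while (!found)
    found = False
    i = 0
    while not found and i < len(items):
        found = items[i] == target
        i += 1
    return found

def find_with_comparison(items, target):
    # Style 2: while (found == false)
    found = False
    i = 0
    while found == False and i < len(items):
        found = items[i] == target
        i += 1
    return found

def find_with_negative_flag(items, target):
    # Style 3: while (notFound), where the flag itself carries the negation
    not_found = True
    i = 0
    while not_found and i < len(items):
        not_found = items[i] != target
        i += 1
    return not not_found

names = ["ann", "bob", "cat"]
for find in (find_with_not, find_with_comparison, find_with_negative_flag):
    print(find.__name__, find(names, "bob"), find(names, "zed"))
```

All three return the same answers. What changes is where the negation lives: in the loop condition, in an explicit comparison, or in the name of the variable.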
The second one above (testing whether variable == booleanValue) I try to avoid, as I generally think of it as bad practice for the following reasons:
- It’s usually more prone to bugs (using = instead of == and thus silently assigning FALSE to the variable instead of testing it, etc… I won’t mention how many times I have erroneously done this!)
- It carries out a superfluous comparison (we can already access the value of the variable directly, which is TRUE or FALSE; we don’t need to test it again to see whether it is equal to that)
- It seems to show a lack of understanding of how evaluating TRUE and FALSE is actually carried out (I try to demonstrate through code I’ve written that I understand how it is working ‘behind the scenes’, though not to the extent of over-optimisation that just makes things obscure!)
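The first two points can be checked with a short Python sketch. Note the assumption about language here: in C, for instance, a condition written with = compiles and quietly assigns, whereas Python happens to reject it outright, which makes the slip easy to demonstrate safely. The helper name compiles is made up for this example.

```python
# For a genuine boolean, comparing against False adds nothing:
# both expressions always produce the same result.
for found in (True, False):
    assert (found == False) == (not found)

def compiles(source):
    # Report whether Python accepts the given source text as valid syntax.
    try:
        compile(source, "<sketch>", "exec")
        return True
    except SyntaxError:
        return False

print(compiles("while found == False: pass"))  # True
print(compiles("while found = False: pass"))   # False
print(compiles("while not found: pass"))       # True
```

Writing the condition as a plain negation leaves no comparison operator to mistype in the first place.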
I’ve come across a similar thing in Excel formulas, where many people seem to add unnecessary steps when testing conditions, marking records etc. For instance, if I have a table with ‘Name’, ‘Gender’ and ‘Age’ and I want to identify all the rows that relate to a man aged over 50 (e.g. to filter the sheet), I need to test the contents of ‘Gender’ (column B, say) and ‘Age’ (column C) and combine them accordingly.
We can put something like:
=IF(AND(B2="Male",C2>50),"*","")
which would put a * (or whatever marker we want, e.g. “Male over 50”) in any relevant rows and an empty string in the others. We could then filter on the *.
Another (I think cleaner) way to do this is to put in the result of the AND function directly and filter on the TRUE/FALSE value Excel returns:
=AND(B2="Male",C2>50)
Depending on what manipulation is needed it’s often possible to get rid of extraneous tests such as IF, by using the TRUE/FALSE value of another function (e.g. =, >, ISERROR etc) directly.
In a simpler case, I have even seen people wrap the test in an IF like that when only testing one value per statement (e.g. if gender = male), which makes it really hard to follow.
The following is more straightforward and directly represents what is being tested:
=B2="Male"
will produce a TRUE or FALSE accordingly.
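The same contrast carries over to code. Below is a rough Python analogue of the two spreadsheet approaches, offered only as a sketch: the sample rows and the function names marker and is_male_over_50 are invented for illustration, and Python’s string comparison is case-sensitive where Excel’s = is not.

```python
# Hypothetical rows standing in for the 'Name', 'Gender' and 'Age' columns.
rows = [
    ("Alan", "Male", 62),
    ("Beth", "Female", 57),
    ("Carl", "Male", 41),
]

def marker(gender, age):
    # Rough equivalent of =IF(AND(B2="Male",C2>50),"*","")
    return "*" if (gender == "Male" and age > 50) else ""

def is_male_over_50(gender, age):
    # Rough equivalent of =AND(B2="Male",C2>50)
    return gender == "Male" and age > 50

for name, gender, age in rows:
    print(name, repr(marker(gender, age)), is_male_over_50(gender, age))

# Filtering on the boolean directly, without going through a marker string.
matches = [name for name, gender, age in rows if is_male_over_50(gender, age)]
print(matches)  # ['Alan']
```

The marker version needs a second step to interpret the * afterwards, whereas the boolean can be used as the filter condition as it stands.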
The way people choose to implement things, whatever language they are working with, even something as (seemingly) ‘simple’ as an Excel sheet (formulas are really programming, after all), seems to say quite a lot about how they conceptualise them internally. That is something to keep in mind when working with code other people have written, perhaps especially in a training (or debugging) situation.
