September Writing Challenge, Post 24: Three Things I Wish American Tech Culture Would Learn

Note: I’ve had a couple of things holding my attention this week, and as a result missed a couple of days of the writing challenge. I’ll catch up.

One more note: I’m having a slightly ranty day. Bear with me.

There are a bunch of things that could be done to make the tech culture more sane and humane. Here are three that rank highly on my list:

1. Working more hours does not necessarily make you more productive. In fact, it may make you far, far less so. We work in one of the few professions where it is possible to do negative work on a daily basis – that is, to make the code worse than we found it. We are more likely to do this when we work long hours. Unfortunately, both American work culture and the tech subculture seek to twist overwork into a virtue. It’s not. Overwork leads to bad decisions. If your boss doesn’t understand this, give him the slide deck I linked earlier in this paragraph (which contains a ton of researched information on productivity topics beyond just hours). If he willfully ignores the facts and says he doesn’t believe it, go work for someone smarter, and let him fail on someone else’s broken back. Also: If you think you’re somehow the exception to this, you’re not. There’s ample research out there – I urge you to look it up.

2. Trading sleep for work just makes you dumber, not more productive. This goes hand-in-hand with the issue of long hours; as with overwork, our culture makes a badge of honor out of sleep deprivation. (I was guilty of this myself when I was younger.) When we don’t get enough sleep, it degrades the quality of our work, and our ability to notice how much our work has degraded. This may be a reason so many people think they’re exceptional in this regard. Spoiler: They’re not. Again, there’s loads of research; Google is your friend.

3. The software profession is not a meritocracy. At least, it’s not if you’re black or a woman. This is made worse by the fact that white guys in the profession often think they’re too smart to have unconscious biases about race, gender, sexuality, &c. It’s made worse still by the fact that most of us in the profession who are any good at it actually did work hard to get there, and feel there’s merit in the rewards we’ve gathered. But if it’s not a meritocracy for everyone, it’s not a meritocracy for anyone, and those of us on the inside need to check our privilege and start examining our own behavior.

</rant>

September Writing Challenge, Post 22: The Customer Is Always Right

If you read yesterday’s post, the title of today’s might seem odd.

If you are designing or writing software for someone other than yourself, you are ethically bound to give your client or employer your best advice on how to meet the project’s goals. (Being persuasive in this is one of the big reasons you should cultivate your communication skills.) Whoever is paying for your skills will then take your advice, or they will not. It could go either way, for reasons that are largely out of your hands.

Whichever way that goes, you are then ethically bound to do it the way the customer wants. Their project, their money. Sometimes, that means implementing things – often user experiences, but sometimes deeper technical details – in a way you know to be somewhere on the scale from “suboptimal” to “imbecilic”. Sometimes, deliberately delivering less than the best possible product* at someone else’s request can fall on an emotional scale from “mildly annoying” to “soul-eroding”.

If that’s too much for you, you could go and make your own product. If you don’t have the savings cushion to take that leap, you could work on a side project on your own time – that can be a real sanity-saver (and a great way to sharpen your skills). And when all else fails, learn this mantra: “Not my circus, not my monkeys.” However you do it, it can be valuable to learn to distance yourself emotionally from work that does not belong to you, when the need arises. Software and business are both complex endeavors, and their intersection will always involve compromise.

Give your customer your best advice, then build the very best version of the thing they are paying for, and know that at the end they are getting what they asked for and deserve the results, for good or ill.

And, of course, make sure that best advice you gave earlier in the project is documented somewhere, with your name attached. Couldn’t hurt.


* If you’re working on medical devices or air traffic control software, and someone could end up maimed or dead, the advice in this post is less applicable. Buck up and push your case harder, in that event.

September Writing Challenge, Post 21: The Customer Doesn’t Know Shit

If you are designing or writing software for someone other than yourself, you’ll spend some amount of time wanting to roll your eyes at people thinking they know how to do your job. Non-technical product managers will suggest specific technical solutions that they’ve heard of but don’t clearly understand. (MongoDB seems to turn up a lot in this context, for no reason I can discern.) Salespeople with no special background in UX design will prescribe inappropriate or outdated UI idioms. (Hamburger menus, a.k.a. pancake menus, still retain a lot of mindshare.)

Your natural and completely appropriate reaction to this sort of thing might be to want to start hitting people with a shovel, but this is bad for repeat business, and not legal in some places.

The thing you have to remember is that these nice people hired or contracted you (or your employer) because they don’t know how to do what you do, even if they sometimes forget this during requirements definition. You’re the expert. When a client falls into buzzword glossolalia, thinking he’s offering a valid technical solution, a better approach is to holster your shovel and say something like: “Let’s not get bogged down in technical details this early. Why don’t we take a step back, and you tell me what you want to accomplish by doing that, and then I can do a proper assessment of whether X is the right tool for the job.”

Gently draw the non-technical client’s attention away from the technical questions best left to you; bring your focus and theirs to the business problem they want to solve, and about which you may reasonably hope they know something. I’ve yet to have a client object to me caring about their problems and wanting to choose the best way to solve them. (Of course, then you’re on the hook to actually solve them, but that’s a topic for another post.)

There is, of course, a flip side to this, but I’ll write about that tomorrow.

September Writing Challenge, Post 18: The D in SOLID

Note: This is the fifth of five posts I’m writing on the SOLID principles of object-oriented programming. Part 1: S, Part 2: O, Part 3: L, Part 4: I

Any discussion of the Dependency Inversion Principle should start by answering the question: What, exactly, is being inverted?

A lot of object-oriented systems start with classes mapping to the higher-level requirements or entities; these get broken down and individual capabilities get drawn out into other classes, and so on. The tendency is to wind up with a pyramidal dependency structure, where any change in the lower reaches of the pyramid tends to bubble up, touching higher and higher-level components.

As an example, let’s think about the Service/Endpoint/Marshaller classes I discussed in my earlier post on the Single Responsibility Principle. It would be very easy to start writing the service class, decide to break out the Endpoint class, and do so in a way that made assumptions that you were calling an HTTP web service – for example, you might assume that all responses from the service would have a valid HTTP response code, or that parameters had to be packaged as a URL-encoded query string.

So what happens if your requirements change such that you must directly call a remote SQL store using a different protocol? You’re going to have to change at least two classes, because of assumptions you made about the nature of your data source.

With the Dependency Inversion Principle, we are told that first, we should not write high-level code that depends on low-level concretions – we should connect our components via abstractions; and second, that these abstractions should not depend on implementation details, but rather the details should depend on the abstractions. I’ve seen the “inversion” part of DIP explained a few different ways, but what I see being inverted is the naïve design’s primacy of implementation over interface.

When you start thinking about how to break down subcomponents, take a step back and think about the interfaces between components, and do your best to sanitize them – remove anything that might bite you if implementation details change.

In the case of the Endpoint, that might mean writing an interface that takes a dictionary of parameter names and values, with no special encoding, and providing for success and failure callbacks. A success callback could give you some generic string or binary representation of the data you requested (which can be passed to a parser/marshaller next). The arguments to the failure callback would be a generic error representation (most platforms have one), with an appropriate app-specific error code and message – not an HTTP status code, or anything else dependent on your data source.
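
As a concrete illustration – this is my own sketch, not code from any project of mine or the post’s, and every name in it is a placeholder – such a source-agnostic interface might look like this in Swift:

    import Foundation

    protocol Endpoint {
        // Plain parameter names and values in; no URL encoding, no HTTP anywhere.
        // Success hands back a generic binary payload for the marshaller; failure
        // hands back a generic error carrying an app-specific code and message.
        func fetch(parameters: [String: String],
                   success: @escaping (Data) -> Void,
                   failure: @escaping (Error) -> Void)
    }

Nothing in that signature betrays whether the implementation speaks HTTP, SQL, or a fax line, which is exactly what keeps the Service insulated from changes in the data source.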

DIP is a key way of limiting technical risk; in this example, after we have changed the interface to be generic with respect to the data source being called, a change to the Endpoint class requirements necessitates little or no corresponding change to the Service class, and vice versa.

The Obligatory Recap

Over these past five posts, I’ve covered five principles for building resilient object-oriented systems, with resiliency being defined as resistance to common classes of errors, low cost of change, and high comprehensibility (i.e., well-managed complexity).

Here are all five once more, not with their canonical formulations (you could get that from the Wikipedia page on SOLID), but with my own distillation of the core lesson (IMHO) from each:

  • Single Responsibility Principle: Give each class one thing to do, and no more.
  • Open/Closed Principle: Extend components, rather than modifying them.
  • Liskov Substitution Principle: Stay aware of the promises your classes make, and don’t break those promises in subclasses.
  • Interface Segregation Principle: Classes should advertise capabilities discretely and generically.
  • Dependency Inversion Principle: Details should depend on abstractions, never the other way around.

Go forth, and write some awesome software.

Next, I’ll post something lighter and non-technical. Promise.

September Writing Challenge, Post 17: The I in SOLID

Note: This is the fourth of five posts I’m writing on the SOLID principles of object-oriented programming. Part 1: S, Part 2: O, Part 3: L

The Interface Segregation Principle is probably the easiest of the five SOLID principles for most programmers to grasp, if for no other reason than that anyone working in an object-oriented language has been exposed to it constantly. The ISP says that small, single-purpose interfaces are to be preferred to large, omnibus interfaces.

Finding examples is easy. Here are a few lines pulled from the Cocoa Foundation headers:
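
(The declarations below are reconstructed from the Foundation headers and abridged – availability macros and the later-added lightweight generics are omitted – but they are representative of what you’ll find there.)

    @interface NSArray : NSObject <NSCopying, NSMutableCopying, NSSecureCoding, NSFastEnumeration>
    @interface NSDictionary : NSObject <NSCopying, NSMutableCopying, NSSecureCoding, NSFastEnumeration>
    @interface NSString : NSObject <NSCopying, NSMutableCopying, NSSecureCoding>
    @interface NSNumber : NSValue
    @protocol NSSecureCoding <NSCoding>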

For those of you who aren’t fluent in Objective-C: In each of those lines, the identifier immediately before the colon is the name of a class or protocol being declared (as distinct from being defined), an identifier immediately after the colon but outside the angle brackets is the parent class of the class being declared, and identifiers inside the angle brackets are protocols to which the class or protocol being declared will conform.

If you’re a native Java speaker, @protocol is very similar to interface; if C++ is your thing, @protocol is akin to a pure abstract base class. In all three cases, it’s all contract and no implementation.

What contracts are these interfaces expressing?

  • NSCopying exists “for providing functional copies of an object.”
  • NSMutableCopying is “for providing mutable copies of an object.”
  • NSSecureCoding offers all the NSCoding methods for archiving an object, and additionally allows an object to assert that it unarchives securely.
  • NSFastEnumeration is “implemented by objects wishing to make use of a fast and safe enumeration style.”

…and of course, each class has the methods that make it special: an NSArray has the operations you’d expect for an ordered, randomly-accessible collection of objects; NSString allows you to search for substrings, and so on.

Each interface defines a very specific capability – you could almost call them atoms of functionality (or promised functionality).

So why do we break up our object declarations into these separate interfaces?

First, it offers you a certain amount of protection. NSCoder (non-Cocoa heads: it archives objects complying with the NSCoding protocol) only needs to know about those methods relating to object serialization. Someone writing an NSCoder subclass doesn’t know and doesn’t need to know about copying or enumeration or any of the other things Foundation objects commonly do, and therefore can’t do anything surprising to an object that is passed into that subclass (like mutate it unexpectedly via a method having nothing to do with archiving). It allows you to expose only those methods a particular caller should care about, and in that way avoid surprises.

Second, it allows you more freedom in how you express the capabilities of a class. Imagine modeling a bird in Objective-C:
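
(A minimal sketch of the idea – the method list here is my own guess, not the original post’s code.)

    #import <Foundation/Foundation.h>

    @interface Bird : NSObject
    - (void)eat;
    - (void)layEgg;
    - (void)fly;
    - (CGFloat)altitude;
    @end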

This looks straightforward, but what about subclasses that don’t need all of those capabilities? Should Ostrich or Penguin throw an exception when you call -fly? Should it be a no-op? What is it reasonable for calling code to expect? You could make Bird a protocol instead of a base class, and make flying-related operations optional:
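
(Again a sketch of my own, marking the flight-related methods @optional:)

    @protocol Bird <NSObject>
    - (void)eat;
    - (void)layEgg;

    @optional
    - (void)fly;
    - (CGFloat)altitude;
    @end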

…but then what do you do when it comes time to model a Bat? Flying is a very similar operation, but all the code you wrote that needs -fly is expecting a Bird. You don’t want to duplicate the same code for a Bat, and you certainly don’t want to start checking types and casting, because you’re eventually going to have to implement FlyingSquirrel, and FlyingFish, and who knows what else, and that code will turn into an error-prone hairball. If the -fly operation is used in the same way on each class, the calling code shouldn’t care about the specific type, only whether -fly is implemented.

With interface segregation, we can declare all of these things very flexibly:
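
(A sketch of one way to slice it – the protocol and class names are mine, chosen to line up with the discussion that follows:)

    @protocol Flying <NSObject>
    - (void)fly;
    - (CGFloat)altitude;
    @end

    @protocol EggLaying <NSObject>
    - (void)layEgg;
    @end

    @interface Eagle : NSObject <Flying, EggLaying>
    @end

    @interface Penguin : NSObject <EggLaying>   // swims; never claims it can fly
    @end

    @interface Bat : NSObject <Flying>          // flies; no eggs involved
    @end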

Behavior that is shared across class hierarchies is broken out into a special-purpose interface. A method to check the altitude of a flying animal doesn’t need to know whether it’s a flying bird or a bat; the method signature - (CGFloat)checkAltitude:(id<Flying>)flyingAnimal; makes it clear that this code cares only about flying animals. You can’t even pass a Penguin to this method. (Another note for the non-ObjC-ers: id<Flying> means any object that conforms to the Flying protocol.)

Going back to the Foundation classes I referenced at the beginning of this post: It might be tempting to say that most of the classes need most of the same functionality, so why not put all the copying, archiving, and enumeration methods on NSObject, or make a subclass or protocol called NSFoundationObject that offers all the relevant methods?

That would work fine for the collection classes, all of which implement all the interfaces. Then we get to NSString… What does it mean to enumerate a string? Our first naïve thought might be to treat the string as a collection of characters, but nothing in NSFastEnumeration says anything about a character encoding, so that’s not going to work. Someday, someone is going to try to enumerate a string, and it will… crash? Throw an exception? Behave like an empty collection? Doing nothing isn’t even an option, because the lone method on NSFastEnumeration has a return value. (And the answer is not to have NSFastEnumeration’s method take a character encoding enum, and have classes that don’t need it ignore it – that’s making the problem worse, not better.)

It gets even sillier with NSNumber. What does it mean to enumerate over an atomic value type? What does it mean to have a mutable copy of it? It would be senseless for NSNumber to claim that it offers these capabilities.

So, it doesn’t. Every type advertises only those capabilities that are meaningful to it, with interfaces that describe those capabilities minimally and generically.

And for those of you diving into Swift, this mode of thinking is a precursor of the new hotness, Protocol-Oriented Programming.

Come back tomorrow for the thrilling conclusion of the SOLID series: the Dependency Inversion Principle.

September Writing Challenge, Post 16: The L in SOLID

Note: This is the third of five posts I’m writing on the SOLID principles of object-oriented programming. Part 1: S, Part 2: O

The Liskov Substitution Principle is probably the deepest and most “academic” of the SOLID principles of object-oriented design. It states that replacing an object with an instance of a subtype of that object should not alter the correctness of the program.

If you read the Wikipedia page for the Liskov Substitution Principle, you’ll see that there is a whole lot packed into that word “correctness”. It touches on programming by contract, const correctness, and a lot of terms that will have limited meaning to people who don’t have a degree in computer science or mathematics. It can also be difficult to see how some of the more academic-sounding constraints apply to the real-world systems that we write. I’m going to try to back into it with a couple of “ferinstances” that motivate a more practically applicable (if slightly less rigorous) formulation of the LSP.

The “classic” LSP example is the square/rectangle problem. It’s natural for us to think of a square as a “specialization” of a rectangle; if you say, “a square is just like a rectangle except that all its sides are of equal length”, most people won’t object.

When you try to bring this abstraction to an object design, however, things break down. Let’s lay out this object hierarchy in Swift – where I had to jump through a couple of hoops to get the square’s constraint to work as needed:
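
Here’s a sketch of that hierarchy (my reconstruction – in this version the “hoops” are property observers on the overridden properties):

    class Rectangle {
        var width: Double
        var height: Double

        init(width: Double, height: Double) {
            self.width = width
            self.height = height
        }

        var area: Double { return width * height }
    }

    class Square: Rectangle {
        // The square's constraint: mutating one side silently mutates the other.
        override var width: Double {
            didSet { if height != width { height = width } }
        }
        override var height: Double {
            didSet { if width != height { width = height } }
        }

        init(side: Double) {
            super.init(width: side, height: side)
        }
    }

    // e.g., in a playground:
    let shape: Rectangle = Square(side: 2.0)
    shape.width = 10.0    // surprise: shape.height is now 10.0 as well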

Calling code that expected a rectangle’s height and width to vary independently, or that had expectations about any derived quantity (like area or the position of a vertex), is now at risk of being broken.

What’s the general principle we can draw from this? It might help to restate the square/rectangle relationship: “A square satisfies all of the constraints of a rectangle, and adds the constraint that its sides must be of equal length.” For the operation of setting width, the Rectangle allowed us to expect that its height would be invariant. The Square breaks that expectation – because of its extra constraint, its property setters mutate state that the parent class’s setters don’t touch. This is part of what it means in that Wikipedia article when it says that “Invariants of the supertype must be preserved in a subtype.”

There are other kinds of constraints that break expectations of calling code. You might be writing an object in a payroll system that has a method to compute compensation, and it might have a method signature like Currency computeCompensation(Employee emp, Timesheet latestTimesheet). That’s a very specific contract made with the calling code, and a subclass may not add a constraint by, for example, demanding that emp must be of the subclass OvertimeEligibleEmployee. Calling code has the reasonable expectation that it may pass in any Employee object or any instance of a subclass of Employee, and further constraining the type of emp breaks that expectation – so badly, in fact, that every OO language that I’ve worked in (which isn’t all of them, by any means, but it’s a fair sample of the common ones) disallows changes to overridden method signatures. You could get around it in the child class’s overridden method by downcasting to OvertimeEligibleEmployee. If you’ve ever been warned against downcasting, this is exactly why – you’re basically saying, “the caller says this is an instance of Employee, but I know better”, and sometimes you’ll be right, but at some point you’re going to be wrong about that and introduce a crash or a hard-to-trace logic error.
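
Here’s a hedged sketch of that trap – everything beyond the names quoted in the paragraph above is my own invention:

    class Employee { }
    class OvertimeEligibleEmployee extends Employee {
        Currency overtimePay(Timesheet sheet) { return new Currency(0); }
    }
    class Timesheet { }
    class Currency {
        final long cents;
        Currency(long cents) { this.cents = cents; }
    }

    class PayrollCalculator {
        Currency computeCompensation(Employee emp, Timesheet latestTimesheet) {
            return new Currency(0);
        }
    }

    class OvertimePayrollCalculator extends PayrollCalculator {
        @Override
        Currency computeCompensation(Employee emp, Timesheet latestTimesheet) {
            // The override can't demand an OvertimeEligibleEmployee in its signature,
            // so the tempting workaround is a downcast, which throws a
            // ClassCastException the first time a plain Employee arrives.
            OvertimeEligibleEmployee hourly = (OvertimeEligibleEmployee) emp;
            return hourly.overtimePay(latestTimesheet);
        }
    }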

This, to me, is the core of the Liskov Substitution Principle: it’s all about constraints and expectations. If your child class introduces a constraint that would break any plausible expectation of the code calling an instance of the parent class, you’re breaking the LSP, and you may or may not be breaking your program.

The LSP is the most restrictive of the five SOLID principles and the easiest to break, either unintentionally, or because you decided that a downcast or an extra property mutation in a child class is okay just this one time. And the LSP gets broken all the time in production code and even well-regarded framework code, sometimes productively. For you Cocoa heads: You’ve seen mutable subtypes of immutable types – NSMutableArray is a subtype of NSArray, NSMutableString is a subtype of NSString… How does that stack up against the “history constraint” cited in the Wikipedia article? Bonus question (that might lead you to drink): How would you change this hierarchy of types to “fix” that?

I encourage you to do some reading on it, and to develop a feel for the innocuous-seeming changes in the lower reaches of your class hierarchies that might break expectations of code written against your parent classes – and likewise for the times when you can profitably but deliberately break the LSP to get things done.

September Writing Challenge, Post 15: The O in SOLID

Note: This is the second of five posts I’m writing on the SOLID principles of object-oriented programming. Part 1: S

The Open/Closed Principle says that software components should be open for extension, but closed to modification.

Sometimes you will see this described in terms of inheritance. Drawing from the examples in my post yesterday, take the case of a retail product that has multiple possible JSON representations exposed by a web API. The product detail API call will give you everything you need to fill in a product information page. Looking up order history, though, you’ll have some part of that same information, plus a quantity, an order price (which may be different than the current purchase price), &c. For the sake of argument, let’s also say you have a well-tested ProductMarshaler class that handles data received from the product detail endpoint. In Swift, you might have something like (assuming the JSON has been parsed into a dictionary):
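
Here’s a stand-in for that marshaler – the Product fields and JSON keys below are my own assumptions, kept minimal to anchor the discussion:

    struct Product {
        let sku: String
        let name: String
        let price: Double
    }

    class ProductMarshaler {
        func marshal(_ json: [String: Any]) -> Product? {
            guard let sku = json["sku"] as? String,
                  let name = json["name"] as? String,
                  let price = json["price"] as? Double else {
                return nil
            }
            return Product(sku: sku, name: name, price: price)
        }
    }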

But what about the order API response, which needs something extra? We could go and alter the existing ProductMarshaler… but that feels uncomfortably close to a violation of the Single Responsibility Principle. It also feels icky to muck around in tested, working code. But we don’t want to duplicate everything from the existing marshaler either, because duplication is bad, m’kay?

We don’t need to alter the stable, working portion of the system in this case, we just need to extend it. In OO languages, one way to do that is with inheritance:
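
Continuing the sketch above (field names are still my assumptions), a subclass can reuse the tested behavior and add only the order-specific bits:

    struct OrderLine {
        let product: Product
        let quantity: Int
        let orderPrice: Double
    }

    class OrderProductMarshaler: ProductMarshaler {
        func marshalOrderLine(_ json: [String: Any]) -> OrderLine? {
            guard let product = marshal(json),     // the parent class does the common work
                  let quantity = json["quantity"] as? Int,
                  let orderPrice = json["order_price"] as? Double else {
                return nil
            }
            return OrderLine(product: product, quantity: quantity, orderPrice: orderPrice)
        }
    }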

We have closed off a stable part of the system to changes while taking advantage of the fact that it is open for extension, reusing the common behavior and specifying only the extra bits we need for our special case.

Inheritance isn’t right for every case, though. What if you’re working on a payroll system that requires you to draw employee data from both a sexy, new JSON-based API and an old and busted XML-based API? Inheritance doesn’t buy us much here – there’s no obvious, natural way to draw common functionality out of the tasks of extracting data from these two formats… But the one thing they do have in common is that you’re passing in a string representation and getting back an Employee model object. Most modern OO languages allow you some way to define an interface – that is, a sort of contract for your object’s behavior – without specifying an implementation. In Java, that might look like:
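
Perhaps something like this – the interface name and the exception type are placeholders of mine, standing in for whatever out-of-band error channel you prefer:

    class MarshallingException extends Exception {
        MarshallingException(String message, Throwable cause) { super(message, cause); }
    }

    public interface EmployeeMarshaller {
        // String representation in, Employee out; errors travel out-of-band as an exception.
        Employee marshal(String rawRepresentation) throws MarshallingException;
    }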

And your classes would look like:
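
For the legacy XML side, a sketch might be (parsing details kept to a bare minimum, and the Employee fields are assumptions):

    import java.io.StringReader;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;
    import org.xml.sax.InputSource;

    public class XmlEmployeeMarshaller implements EmployeeMarshaller {
        @Override
        public Employee marshal(String rawRepresentation) throws MarshallingException {
            try {
                Document doc = DocumentBuilderFactory.newInstance()
                        .newDocumentBuilder()
                        .parse(new InputSource(new StringReader(rawRepresentation)));
                // Map whatever fields the old API exposes onto the model object.
                Employee employee = new Employee();
                employee.setName(doc.getElementsByTagName("name").item(0).getTextContent());
                return employee;
            } catch (Exception e) {
                throw new MarshallingException("Could not parse employee XML", e);
            }
        }
    }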

…and similarly for the JSON marshaller.

In both cases, when we say the class implements an interface, we’re forcing it to commit to a contract: “I’m going to take in a string representation and return an Employee object and send you errors out-of-band.” The interface is the part that we have closed to modification; we’re saying that all our employee marshallers must commit to this contract. Implementation is left open; the newer JSON implementation (which, being new, might still be seeing changes to its requirements) can change freely without impacting the legacy XML implementation.

If you read articles about SOLID and Open/Closed on the web, you’ll see both interpretations of the principle, inheritance-based and interface-based (with the interface-based interpretation being newer and hipper). Both are useful. There may be other interpretations, as our thinking and our computing languages evolve. The most profitable way to think about the Open/Closed Principle is to ask yourself: What are the parts of the system that are likely to change very slowly, or not at all? Whether those are interfaces or implementations, those are the things that you should figure out early, and close to modification. These stable, closed parts of your system should be the coupling points between the “open” parts of your system. With interfaces especially, those closed portions of your object design can be used to segregate regions of change and technical risk, so that they remain tractable in the face of requirements changes and debugging.

And speaking of interfaces, they’ll figure heavily into tomorrow’s post on the Liskov Substitution Principle.

September Writing Challenge, Post 14: The S in SOLID

Note: This is the first of five posts I’m writing on the SOLID principles of object-oriented programming, and the fourteenth of 30 posts that I’m writing in September.

If you write object-oriented code, but have never heard the phrase “SOLID principles”, we should talk.

SOLID is a mnemonic for these five things that you should pay attention to when doing object design:

  • Single Responsibility Principle
  • Open/Closed Principle
  • Liskov Substitution Principle
  • Interface Segregation Principle
  • Dependency Inversion Principle

Come back over the course of the week to see the last four, but today I’m going to ramble a little about the Single Responsibility Principle – or as I sometimes call it, the Small, Dumb Object Principle.

In code calling web services, there’s a pattern I commonly see: Someone will write a class that takes some query parameters, calls the network endpoint, pulls out the JSON (or an error) from the response, interprets the JSON into a model object, and returns it to the user (usually asynchronously, via a block).

I think that’s a fine interface to show to calling code – request params in, model objects out – but I don’t think much of it as an object design. There’s too much going on in that one class. I would break it up like so:

  • There should be an Endpoint class that does nothing but turn the parameters into an HTTP request, and blindly report the result (successful JSON response or an error state).
  • There should be a Marshaller class that does nothing but consume JSON and spit out model objects.
  • There should be a Service class that coordinates the other classes, and perhaps does some manipulation/saving of the models or interpretation of the error state before reporting them to the user.
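
As a rough sketch of that split – every name and signature below is mine, chosen only to keep the example short:

    import Foundation

    protocol Endpoint {
        // Parameters in; raw response data or an error out. Nothing else.
        func fetch(parameters: [String: String],
                   success: @escaping (Data) -> Void,
                   failure: @escaping (Error) -> Void)
    }

    protocol Marshaller {
        associatedtype Model
        // Raw response data in; model objects out.
        func marshal(_ data: Data) throws -> [Model]
    }

    final class Service<M: Marshaller> {
        private let endpoint: Endpoint
        private let marshaller: M

        init(endpoint: Endpoint, marshaller: M) {
            self.endpoint = endpoint
            self.marshaller = marshaller
        }

        // The coordinator: feed the Endpoint, hand its output to the Marshaller,
        // and report models or an error to the caller.
        func load(parameters: [String: String],
                  completion: @escaping ([M.Model]?, Error?) -> Void) {
            endpoint.fetch(parameters: parameters,
                           success: { data in
                               do { completion(try self.marshaller.marshal(data), nil) }
                               catch { completion(nil, error) }
                           },
                           failure: { error in completion(nil, error) })
        }
    }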

To some people, that seems like overkill – three classes when you could have one. I’d say they’re not thinking deeply enough about software requirements and how they tend to change.

The Single Responsibility Principle states that a class should have only one responsibility. Put another way, there should only be one category of system requirement that forces each class to change.

What happens when your web service gets versioned? What if you have to handle multiple versions of the API in your code? If you’ve coupled your JSON marshalling code tightly to the output of your network call, it’s going to be a pain to tease that out. It’s wiser to separate your concerns, keep the network code dumb about how its output will be used, and be able to swap in the right Endpoint object as needed.

Keeping the Marshaller separate has benefits, too. What if you’re working on a dating app that has identical user objects come down as part of login, search results, messages, and a local newsfeed? You’ll want that marshalling code packaged reusably, decoupled from any one API call, but accessible to them all.

Another problematic implementation I see is packaging the marshalling code inside the model class. The argument goes: why shouldn’t the model object create itself? You get one less object, and it doesn’t break the reusability case I described above.

What if you’re working on a retail app that has different product representations come down from different API calls? The product detail call is going to have a lot more information than the search results, and will also look different from products in a shopping cart (which will at least have a quantity added), or the products in order history (which will have a quantity, an order price separate from the current price reported by the detail call, possibly a link to the order…). If you try to handle all these cases in the model object, three quarters of your model code is going to be devoted to translating these different JSON representations into model instances. Does that make sense? Does that match with the “data and business logic” description of the M in MVC? On the other hand, a small class hierarchy of Marshallers can handle this cleanly while keeping the model class ignorant of the details of its own serialized representation.

Because the Endpoint and Marshaller objects are ignorant of each other, they’ll need help to get anything done. So I write a Service object that passes parameters to an Endpoint, passes its output to the Marshaller, and returns the results to the user.

Is the Endpoint pulling from an HTTP web service? A SQL data store? An interface to a fax line? The Service doesn’t care; it just knows that the Endpoint takes parameters and asynchronously spits out data and errors. Is that data in JSON? XML? Packed binary format? Cuneiform? The Service doesn’t care; that’s the Marshaller’s problem. It just knows to feed the Marshaller data and collect the model objects that fall out – at most, it has to choose the right Marshaller based on a data type or other out-of-band signal from the Endpoint.

Because each class is small and limited in scope, such classes will tend to be very testable in isolation – which is great if you’re doing TDD or any other automated testing practice. Because each class is small and limited in scope, it will be very easy to look at the code and reason about what the object is doing – and that makes your life so much easier when you’re debugging.

The most common complaint I’ve heard about this way of doing things is that it leads to a proliferation of small classes and class files, and I have to admit, that leaves me a little baffled about the logic behind the objection. Reducing the number of files in a project is not a reason to do anything. Is business value somehow inversely proportional to the number of files in a project? Does such an objection refute any of the practical points about reusability, complexity, or coupling that the Single Responsibility Principle addresses? As Edward Tufte has famously said: “There is no such thing as information overload. Only bad design.” I’d extend that to say that there is no such thing as too many class files in a project – there are only poorly organized projects.

Small, dumb classes with standard interfaces are the building blocks of stable systems. This is one of the key ways we manage and mitigate complexity in object-oriented programming.

Tomorrow, I’ll share some thoughts about the Open/Closed Principle, which is one of the ways we manage change in OOP.

September Writing Challenge, Post 12: TIL, Radix Sort Edition

It occurred to me over coffee this morning that I’d never implemented my own radix sort; so, while the caffeine kicked in, I did that, and I learned a few things.

(Everything below applies only to my radix sort implemented in Objective-C in Xcode 6.4 on a machine with an Intel Core i7 processor, sorting a list of a million pseudorandom positive integers with values in the interval [0, 10^8). YMMV.)

Once I got the sort implemented and performance tests set up, the first thing I did was to parametrize it by radix. When you see a radix sort demonstrated in books, you always see it done in base 10. Unless the numbers were represented as strings, that always just seemed weird and unnatural to me, so I experimented with different radices. For my less-than-rigorous test, a base 2 sort took about 3 times longer than a base 10 sort, which was slightly slower than a base 16 sort. Base 100 was faster still.

This makes sense: Radix sort time complexity is O(kN), with N as the number of elements to sort and k as the number of digits in the largest value when written in the radix you’ve chosen. Radix goes up, k goes down. Gist and code:
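
For reference, here is a from-scratch sketch of an LSD radix sort parameterized by radix – my own code, not the linked gist, written as the plain C that an Objective-C implementation’s hot loop boils down to:

    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>

    // LSD radix sort over non-negative 32-bit integers, with the radix as a parameter.
    // O(kN), where k is the number of digits of the largest value in the chosen radix.
    static void radix_sort(uint32_t *values, size_t count, uint32_t radix) {
        if (count == 0) return;

        uint32_t max_value = 0;
        for (size_t i = 0; i < count; i++) {
            if (values[i] > max_value) max_value = values[i];
        }

        uint32_t *scratch = malloc(count * sizeof *scratch);
        size_t *buckets = malloc(radix * sizeof *buckets);

        // One counting-sort pass per digit, least significant digit first.
        for (uint64_t divisor = 1; max_value / divisor > 0; divisor *= radix) {
            memset(buckets, 0, radix * sizeof *buckets);
            for (size_t i = 0; i < count; i++) {
                buckets[(values[i] / divisor) % radix]++;
            }
            for (uint32_t d = 1; d < radix; d++) {   // prefix sums -> end positions
                buckets[d] += buckets[d - 1];
            }
            for (size_t i = count; i-- > 0; ) {      // stable placement, back to front
                scratch[--buckets[(values[i] / divisor) % radix]] = values[i];
            }
            memcpy(values, scratch, count * sizeof *values);
        }

        free(scratch);
        free(buckets);
    }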

With that finding, I wondered how much better you could do choosing a power-of-2 radix, and doing a little bit bashing – Objective-C is still a superset of C, and things like the C bit-shift operators are native instructions on Intel processors, so I figured I might be able to squeeze a little more performance out of my sort by eschewing % and / in favor of >> and &. I was correct – I can get almost a 2x improvement using bit manipulation over arithmetic operations in the base 2 and base 16 sorts. Gist and code:
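
Again, a sketch of mine rather than the gist – the same sort with the divide and modulo swapped for shift and mask, for any power-of-two radix:

    // Same structure as the sketch above (and the same headers); the digit is now
    // extracted with >> and & instead of / and %.
    static void radix_sort_pow2(uint32_t *values, size_t count, uint32_t bits_per_digit) {
        if (count == 0) return;

        const uint32_t radix = 1u << bits_per_digit;   // e.g. 4 bits per digit -> radix 16
        const uint32_t mask  = radix - 1;
        uint32_t *scratch = malloc(count * sizeof *scratch);
        size_t *buckets = malloc(radix * sizeof *buckets);

        for (uint32_t shift = 0; shift < 32; shift += bits_per_digit) {
            memset(buckets, 0, radix * sizeof *buckets);
            for (size_t i = 0; i < count; i++) {
                buckets[(values[i] >> shift) & mask]++;
            }
            for (uint32_t d = 1; d < radix; d++) {
                buckets[d] += buckets[d - 1];
            }
            for (size_t i = count; i-- > 0; ) {
                scratch[--buckets[(values[i] >> shift) & mask]] = values[i];
            }
            memcpy(values, scratch, count * sizeof *values);
        }

        free(scratch);
        free(buckets);
    }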

Notes:

  • I realize I am discovering nothing new here. These were just my own learnings from poking around over coffee.
  • It should not come as a surprise that bad things happen when you add negative numbers to your test set (at least with this implementation).
  • Moving to base 256 with signed int test data and the range of numbers I was working with led to more bad things happening.
  • Maybe I’ll have some more coffee.

September Writing Challenge, Post 8: What I Care About in a Technical Interview

If I’m interviewing you for a technical position, I’ll ask a bunch of questions. Most of them are warm-ups, or checking off a list of basic information I’m collecting. There are two questions that take up most of the interview time, and contribute the most to my interest (or lack thereof) in your candidacy.

First, I will ask you to describe the technical details of a recent project. I’ll get down to the object design level, and I’ll question decisions like your choice of data store. If you are articulate about the details of the software you’ve written, and you have reasons for the decisions you’ve made, that works in your favor. It also lets me poke at how well you know the SDK we’ll be using, but that’s secondary.

I also have a pet algorithms & data structures question (which I will not share here). I’ve asked it of a few dozen people, and maybe two have taken it to a solution that is optimized in time or memory, and only one pulled out multiple good solutions. There is no single correct solution, and you’re not expected to get all the way to an optimal answer anyway. While I definitely use this question to probe CS fundamentals, that’s not the most important part of your answer – I’ve worked with some sharp people who didn’t know Big O Notation from Maxwell’s Equations. It’s much more important to me how creative you are in solving the problem, how many paths you see to go down, what hidden assumptions creep in, and what tools you reach for.

Comment fodder: Does anyone want to share favorite interview techniques? How about good or bad interview questions you’ve been asked?