Friday, August 28, 2009

Software Development Overview

All software, outside of the most trivial examples, handles data (some form of digital representation). With that data, software basically does four things:
  1. Capture
  2. Process
  3. Store
  4. Output
In each case what the data is and how it is handled differs, but really that is the crux of software development. Leaving aside the hardware on which the software runs, these four items are it. That said, each one of them can be drilled down into very deeply, with each layer yielding more approaches, ideas, philosophies, etc.

Methodologies on how to create each of these pieces and make them work together compose volumes. Whole companies exist to serve small parts of each of the approaches.

That said, it is still just as simple as these four things.

Data has to be captured from the environment outside of the computer program: brought in from a keyboard being typed on by a human, from a file on a disk being read, from a camera storing an image, or by one of many other approaches. There are a multitude of ways for capture to occur, and there are fierce debates about the best way to do so.

Processing and storage can be done in either order, but I am putting processing before storage because normally something happens to the data before it is stored for future use. All the clever algorithms for sorting or determining values that take up so much time on the discussion boards fall into this bin. Again, debates are fierce about approaches to processing.

Storage can be done in various ways and is slightly less contentious. Physically the data winds up in the memory of the computer, either volatile (in RAM) or non-volatile (a storage medium like a disc or flash memory). The format in which the data is stored can yield some debate, but not nearly as much as where it should go and the method for retrieving it.

Output is communicating the processed and/or stored data to another system, be it a human or another computer. For humans this can be something like a monitor, speakers or a printer. Something for one of our senses to experience. For another computer (or the same computer, running a different program) it can be bits transmitted over a network or stored onto a disc (somewhat overlapping with the storage aspect).
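To make the four items concrete, here is a minimal sketch in Python. Everything in it is an assumption chosen for illustration: the function names, the made-up numeric readings, and the use of in-memory text buffers standing in for a real input file and a real disk.

```python
import io
import json


def capture(source):
    """Capture: read non-empty lines of text from any file-like object."""
    return [line.strip() for line in source if line.strip()]


def process(raw_values):
    """Process: turn the captured text into numbers and sort them."""
    return sorted(float(value) for value in raw_values)


def store(values, target):
    """Store: write the processed values as JSON to any file-like object."""
    json.dump(values, target)


def output(values):
    """Output: build a summary meant for a human to read."""
    return f"{len(values)} readings, min {values[0]}, max {values[-1]}"


# Hypothetical input; a real program might read a file or the keyboard.
source = io.StringIO("3.5\n1.25\n\n2.0\n")
# Stand-in for a file on disk.
storage = io.StringIO()

readings = process(capture(source))
store(readings, storage)
summary = output(readings)
print(summary)  # 3 readings, min 1.25, max 3.5
```

Each function only knows about its own job, so the capture step could be pointed at a different source without touching the other three.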

As I said earlier, each of these items can be expanded on to very great extents but it is instructive to be able to come back to them whenever you start to get lost in the chaos of information that surrounds all human enterprise and by extension the computer software that supports it. Most often each of the pieces can be separated from each other and you can make decisions based on what works best for you for each one instead of having to put all of your eggs into one basket for all four items.
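One way to picture that separation, as a rough sketch only: pass each of the four pieces in as a function, so any one of them can be swapped without touching the rest. The names here (run_pipeline and the small stage functions) are hypothetical and exist only for this illustration.

```python
def run_pipeline(capture, process, store, output):
    """Run the four stages in order; each stage is supplied by the caller."""
    data = process(capture())
    store(data)
    return output(data)


def capture_fixed():
    # Hypothetical capture stage: a hard-coded list instead of real input.
    return ["b", "a", "c"]


def process_sort(items):
    return sorted(items)


memory_store = []


def store_in_memory(items):
    # Volatile storage: keep the items in a list for the life of the program.
    memory_store.extend(items)


def store_nowhere(items):
    # A deliberately empty storage stage, to show the piece is optional.
    pass


def output_csv(items):
    return ",".join(items)


def output_lines(items):
    return "\n".join(items)


# Same capture and processing, different storage and output choices.
first = run_pipeline(capture_fixed, process_sort, store_in_memory, output_csv)
second = run_pipeline(capture_fixed, process_sort, store_nowhere, output_lines)
print(first)   # a,b,c
print(second)
```

The two runs share capture and processing but make different storage and output decisions, which is the kind of per-piece choice described above.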

Philosophies surrounding the development of software to do these four things run the gamut from just sitting down and starting to write code to huge processes that involve many people who think for a very long time about what they want to accomplish, how they want to do it, how they will measure their efforts, etc. There are fervent proponents of each approach who tout the benefits of their favorite and will stick by it dogmatically without ever thinking that there might be another way to do things.

The techgnostic approach is to look at the problem you are trying to solve and then line it up with the four items that software development requires. Very often you can crank something out quickly that gets the immediate need addressed but will come back to haunt you later. You can also spend a huge amount of time trying to get everything perfect the first time. Try to shoot for something between the two, subject to constraints like safety.

If you are building software to control a medical device or drop a bomb you need to think long and hard about it and the solution will tend towards the more restrictive process. If it is a program to dump some data periodically for someone so they can run a trending analysis it requires very little oversight.

If you look around yourself, at the software that you use (like the browser you are reading this on) you can quickly discern where the four pieces are that make it work. Did the entity that produced the software make all the pieces depend tightly on each other, or are they interchangeable? There are probably other options for what you are trying to do that approach the problem domain from a different perspective. Do not get locked into only one way.

I will return to this subject over time because a lot of philosophy about how to get things done is buried in it. Cowboy programming versus gold plating. Architecture versus expedience. The topics are legion. To complicate things further, there are significant issues that transcend these four items, having to do with the management of people and the economics of the entire process.
