Showing 7 posts
We have reached the end of the Production software series. Before moving on to other topics, let’s wrap it up, as is customary, by listing all the posts so far: Series introduction; Be wary of assertions; Constants will bite you; Hide new features behind flags; Logging; and Identifying your builds. But this does not represent the end of this topic’s coverage in this blog. If you have ideas for further posts in this same area, don’t hesitate to send them my way!
October 31, 2013 · Tags: <a href="/tags/production-software">production-software</a> · Continue reading (about 1 minute)
The scenario of the day: a binary you deployed to production two months ago has been running fine and dandy since then… until today, when the report of a strange and concerning crash arrived. Nothing really unusual: these things happen all the time and need to be dealt with. But you, as the proud developer of the software, attempt to reproduce the problem on your own system with the most recent sources and… cannot do so: the issue escapes all your tests (again, not unusual). Is the problem really gone or have recent changes to the source hidden the problem by modifying the triggering conditions? Either way you have to verify it because the second case is usually quite scary.
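The excerpt stops before any remedy is given, so the following is only a minimal sketch of one way to make a deployed program say which sources it came from. It assumes a build system that stamps the values below at build time; the names `BUILD_VERSION`, `BUILD_REVISION`, `BUILD_TIMESTAMP` and `myapp` are hypothetical and not taken from the post.

```python
import platform

# Hypothetical placeholders: a real build system would overwrite these at
# build time, for example with the version control revision being built.
BUILD_VERSION = "1.4.2"
BUILD_REVISION = "unknown"
BUILD_TIMESTAMP = "unknown"


def build_info():
    """Return a one-line description of this build for logs and bug reports."""
    return "myapp {} (revision {}, built {}, python {})".format(
        BUILD_VERSION, BUILD_REVISION, BUILD_TIMESTAMP,
        platform.python_version())


if __name__ == "__main__":
    # Printing this at startup ties every crash report to an exact build.
    print(build_info())
```

With something like this in place, a crash report from a two-month-old binary can be matched to the exact revision to check out and test against.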
October 28, 2013 · Tags: <a href="/tags/production-software">production-software</a> · Continue reading (about 3 minutes)
Quoting the Wikipedia as is becoming customary: logging is the cutting, skidding, on-site processing, and loading of trees or logs onto trucks or skeleton cars. Huh, wait… I think I got that definition wrong. Let me try my own: Logging is the recording of application activity on-the-fly onto a reliable storage system. Having this functionality in your software is something you don’t think about until you need to debug an issue and, well, then it is too late. When your application crashes, especially in production, the more information you have access to, the better. A detailed record of the program activity that led to the crash is often incredibly insightful. (Hint: compare to flight data recorders.)
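To make the definition concrete, here is a minimal sketch using Python's standard `logging` module to record activity in a file as it happens. The logger name `myapp`, the helper `make_logger` and the log messages are assumptions made for illustration; the post itself may recommend something different.

```python
import logging
import os
import tempfile


def make_logger(path):
    """Create a logger that appends timestamped records to the file at path."""
    logger = logging.getLogger("myapp")  # hypothetical application name
    logger.setLevel(logging.DEBUG)
    handler = logging.FileHandler(path)
    handler.setFormatter(logging.Formatter(
        "%(asctime)s %(levelname)s %(message)s"))
    # Call this once per process: each call attaches another handler.
    logger.addHandler(handler)
    return logger


# A temporary directory stands in for the "reliable storage system" here.
log_path = os.path.join(tempfile.mkdtemp(), "myapp.log")
logger = make_logger(log_path)
logger.info("Starting request for %s", "example.txt")
logger.error("Request failed after %d seconds", 60)
```

Each record is written as the event happens, so the file still holds the history leading up to a crash even if the process dies right afterwards.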
October 24, 2013 · Tags: <a href="/tags/production-software">production-software</a> · Continue reading (about 7 minutes)
Following on the topic of run-time changeability from the previous post, it is now time to talk about new features and the process of deploying them. Scenario: your developer team has been hacking non-stop for the last few months on a highly anticipated feature. This feature has been developed on a branch all along and has never seen the light of production. Until now: the upcoming release of the software has the branch merged in and the shiny new code built into the binary, ready to roll to production. You start deploying it and… guess what? Everything works just fine and users are grateful for the new functionality! Except… the new code triggers a serious data-corruption bug that you do not discover until you have rolled out the new release to 80% of your fleet.
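Going by the post's title, the way out of this scenario is to ship the new code disabled and turn it on separately from the release. The sketch below shows one possible shape for that, assuming flags are read from environment variables; the `MYAPP_ENABLE_` prefix, `feature_enabled` and `save_document` are hypothetical names, not the post's.

```python
import os


def feature_enabled(name, environ=os.environ):
    """Return True only if the hypothetical MYAPP_ENABLE_<NAME> flag is "1"."""
    return environ.get("MYAPP_ENABLE_" + name.upper(), "0") == "1"


def save_document(contents, environ=os.environ):
    """Pick the storage code path based on the flag; the old path is the default."""
    if feature_enabled("new_storage", environ):
        return "new:" + contents  # new code path, off unless the flag is set
    return "old:" + contents      # existing, well-tested code path
```

Because the default is off, rolling out the binary changes nothing by itself, and disabling a misbehaving feature only requires flipping the flag rather than rolling back the whole release.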
October 21, 2013 · Tags: <a href="/tags/production-software">production-software</a> · Continue reading (about 3 minutes)
Early on in your career as a developer, you are told to never put constant values in the code and instead define those separately using a self-descriptive name. For example, instead of doing this:

    def modify_remote_file(...):
        contents = get_remote_file(server_url, file_name, timeout=60)
        ...
        put_remote_file(server_url, file_name, contents, timeout=60)

You would do this:

    # Maximum amount of time a request on the file server can take.
    FILE_SERVER_TIMEOUT = 60

    def modify_remote_file(...):
        contents = get_remote_file(server_url, file_name, timeout=FILE_SERVER_TIMEOUT)
        ...
        put_remote_file(server_url, file_name, contents, timeout=FILE_SERVER_TIMEOUT)

This is obviously good advice, so I’m not going to dispute it here: the hardcoded number in the first snippet was hard to keep track of, hard to keep consistent across calls and hard to tweak when necessary.
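The excerpt ends before the post makes its point. Judging from the excerpt of "Hide new features behind flags" on this page, which describes this post as being about run-time changeability, a plausible next step is to let the value be overridden without rebuilding. The following is a sketch under that assumption; the `FILE_SERVER_TIMEOUT` environment variable and the helper `file_server_timeout` are hypothetical.

```python
import os

# Default used when no override is provided or the override is invalid.
DEFAULT_FILE_SERVER_TIMEOUT = 60


def file_server_timeout(environ=os.environ):
    """Return the timeout, honoring a hypothetical FILE_SERVER_TIMEOUT override."""
    raw = environ.get("FILE_SERVER_TIMEOUT")
    if raw is None:
        return DEFAULT_FILE_SERVER_TIMEOUT
    try:
        value = int(raw)
    except ValueError:
        # Ignore malformed overrides instead of failing at startup.
        return DEFAULT_FILE_SERVER_TIMEOUT
    return value if value > 0 else DEFAULT_FILE_SERVER_TIMEOUT
```

Callers would then pass `timeout=file_server_timeout()` where they previously used the constant, so the value can be tuned in production without a new build.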
October 17, 2013 · Tags: <a href="/tags/production-software">production-software</a> · Continue reading (about 3 minutes)
As the Wikipedia puts it (emphasis mine): An assertion is a predicate (a true–false statement) placed in a program to indicate that the developer thinks that the predicate is always true at that place. If an assertion evaluates to false at run-time, an assertion failure results, which typically causes execution to abort. On the programmer’s side, assertions are invaluable in writing readable code: they provide a mechanism with which developers can explicitly state their thoughts and expectations in the form of code instead of comments.
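As a small illustration of the definition, the sketch below states an expectation in code using Python's built-in `assert` statement. The function `average` is a made-up example and does not come from the post.

```python
def average(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    # States the developer's expectation in code rather than in a comment.
    # Python skips assert statements entirely when run with the -O option.
    assert len(values) > 0, "average() requires at least one value"
    return sum(values) / len(values)
```

If the expectation is ever violated at run time, the `assert` raises `AssertionError`, which aborts the program unless something catches it. That is exactly the behavior worth thinking twice about in production.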
October 14, 2013 · Tags: <a href="/tags/production-software">production-software</a> · Continue reading (about 3 minutes)
Hello again! After a pretty busy week at EuroBSDCon 2013 and a week of vacation right after (which made me miss the two posts due last Thursday and Monday), it is time to start a new series titled “Production software”. Originally, I had intended this series to be called “Production engineering”, but the definition that the Wikipedia gives for that term does not really convey the contents of the upcoming posts. So what will we go through?
October 10, 2013 · Tags: <a href="/tags/production-software">production-software</a> · Continue reading (about 1 minute)