With the rise of automation and the growing use of “infrastructure as code”, IT professionals responsible for systems management are increasingly exposed to new demands and new risks. It’s therefore worth going back to software development fundamentals and considering which lessons apply in this new world; even if you aren’t doing full continuous integration/continuous deployment, plenty of the basic rules still apply.
It is absolutely critical that your company have multiple environments that developers can use to modify, break, and extend code. No code is ever perfect first time; if changes are made on the fly to production, it’s a question of when, not if, you’ll suffer serious downtime as a consequence.
Usually, code is developed in multiple stages. First is the developer’s own sandbox, where everything that happens is isolated from the rest of the world. This is useful for testing ideas and thoughts about how the code could work. Mistakes, dead ends, and unforeseen consequences thrive here – that is, after all, the whole point of the sandbox. Try something; if it doesn’t work, move on to the next idea.
Second is a shared development environment, where the code everybody is working on comes together. By the time code reaches this stage, the glaringly obvious problems will have been resolved, but there will still be the possibility of other, slightly-less-obvious problems coming up due to interaction with other new code.
The third stage is the test environment, where more formal testing is carried out, to verify that everything works as expected before pushing up to acceptance, and then to production.
Best practice is to build and configure these environments identically to production (or, if the configuration has to change because the code has changed, to propagate those configuration changes to production once testing is complete). Failing to do so may mean that problems arise in testing that would not arise in production, or – worse – that problems that will arise in production aren’t exposed during testing.
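One way to keep an eye on this is to compare the settings of two environments mechanically. The following is a minimal sketch in Python, assuming each environment's configuration has already been loaded into a dictionary; the function name `config_drift` and the sample settings are invented for illustration.

```python
ABSENT = "<absent>"  # placeholder shown when a key exists in only one environment


def config_drift(reference, candidate):
    """Return {key: (reference_value, candidate_value)} for every key that differs."""
    drift = {}
    for key in sorted(set(reference) | set(candidate)):
        ref_value = reference.get(key, ABSENT)
        cand_value = candidate.get(key, ABSENT)
        if ref_value != cand_value:
            drift[key] = (ref_value, cand_value)
    return drift


# Hypothetical settings, invented for this example.
production = {"db_pool_size": 20, "tls": True, "log_level": "warning"}
test_env = {"db_pool_size": 5, "tls": True, "debug_toolbar": True}

for key, (prod_value, test_value) in config_drift(production, test_env).items():
    print(f"{key}: production={prod_value!r} test={test_value!r}")
```

Some differences (hostnames, credentials) are expected between environments, so a real check would probably need a list of keys to ignore.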
The biggest difficulty is building up a solid set of test data: extracting data from production for this purpose could expose the company to privacy and security risks that may not be immediately obvious. At the same time, there will be occasions when a problem can only easily be reproduced with such an extract.
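If an extract from production really is needed, one common mitigation is to mask the sensitive fields before the data leaves production. The sketch below is illustrative only: the field names, the key, and the `pseudonymise` function are assumptions, and replacing values with keyed hashes is pseudonymisation rather than true anonymisation, since the remaining fields may still identify a person.

```python
import hashlib
import hmac

# Hypothetical key: in practice it would be generated and stored securely,
# never committed alongside the code.
SECRET_KEY = b"replace-me-and-keep-out-of-version-control"

# Hypothetical list of fields considered sensitive.
SENSITIVE_FIELDS = {"name", "email"}


def pseudonymise(record, key=SECRET_KEY):
    """Return a copy of `record` with sensitive fields replaced by keyed hashes."""
    masked = {}
    for field, value in record.items():
        if field in SENSITIVE_FIELDS:
            digest = hmac.new(key, str(value).encode("utf-8"), hashlib.sha256)
            # The same input always yields the same token, so relationships
            # between records survive the masking.
            masked[field] = digest.hexdigest()[:16]
        else:
            masked[field] = value
    return masked


customer = {"id": 42, "name": "Alice Example", "email": "alice@example.com"}
print(pseudonymise(customer))
```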
Version control is also known as revision control, or source control. In its simplest form, it means keeping a copy of every version of the code that was ever written or deployed. Generally speaking, when using version control, developers will write code on their own system, bringing it up to an appropriate level of quality before committing it to a central repository. If the changes introduce an unacceptable number of problems, the committed code can be rolled back to the earlier version quickly and easily.
A good quality version control system will allow multiple developers to work on the code base simultaneously, flagging conflicting changes when (or just before) they are committed. Even better is if the system allows atomic commits – where a commit to the repository either completely succeeds, or completely fails. This is strongly desirable when changes to multiple files are required in order to complete a particular change.
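The all-or-nothing behaviour of an atomic commit can be illustrated with a toy model. This is not how any particular version control system is implemented; the "repository" here is just a dictionary of file names to contents, and the validation rule is a stand-in for whatever might cause a real commit to be rejected.

```python
class CommitError(Exception):
    """Raised when any part of a commit is rejected."""


def atomic_commit(repository, changes):
    """Return a new repository with every change applied, or raise and change nothing."""
    staged = dict(repository)  # work on a copy, never on the live data
    for path, content in changes.items():
        if not isinstance(content, str):  # stand-in validation rule
            raise CommitError(f"{path}: content must be text")
        staged[path] = content
    # Only reached if every change was accepted.
    return staged


repo = {"app.conf": "workers=2"}

# Both files change together.
repo = atomic_commit(repo, {"app.conf": "workers=4", "db.conf": "pool=10"})

# One bad file causes the whole commit to be rejected; app.conf keeps workers=4.
try:
    repo = atomic_commit(repo, {"app.conf": "workers=8", "db.conf": None})
except CommitError as error:
    print("commit rejected:", error)

print(repo)
```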
The de facto standard in the Linux community for version control is Git. Git differs from what users of centralised systems might expect: there is no central repository that holds a “single source of truth”, except by convention. Each developer has a full copy of the entire repository; changes made by a given developer are committed to their own repository, and only become visible to others when pushed to other repositories. Typically, a company will have a server that holds the “main” Git repository, deemed to be the source of truth; developers clone that repository, update from it as required, and push their changes to it once complete.
If your company doesn’t yet have version control software, standardising on Git is a reasonable choice. If it does have version control software, it’s arguably better to stay on the package already in use and take advantage of the institutional knowledge than to switch to something else and face a new learning curve. The exception is if there is a significant improvement to be had from the switch; for example, companies using Visual SourceSafe should look to move to something different, since that package is no longer being maintained. Similarly, CVS is being maintained with bug fixes only; there is a strong case to be made for switching to something that is being actively maintained and developed. Conversely, for other packages, such as Subversion, the case for switching is not as clear cut, and the argument for staying with a known quantity is reasonable.
In the context of “infrastructure as code”, everything that can be put into version control, should be – even application and system configuration files. The storage space necessary to store this information is cheap; the cost in human time to figure out how to fix problems caused by changes is high. The ability to roll back quickly, and then figure out what went wrong without having pressure from the downtime, is invaluable.
Testing code is good. Automatically testing code is better: it helps to ensure that tests aren’t overlooked, or their results ignored. Early on, this will be difficult, as it’s not always obvious what should be tested, and what cases would be just pointless noise. But a natural and essential part of software development is testing the newly written code; these tests can be adapted to become a part of the test suite. Regression tests – making sure that old faults don’t reappear – are also good to have. Over time, done properly, the test cases should grow to cover the entirety of the code base.
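As a small illustration of turning development-time checks into a permanent suite, here is a sketch using Python's standard `unittest` module. The function under test, `parse_port`, and the past fault it guards against are invented for the example.

```python
import unittest


def parse_port(value):
    """Convert a configuration value to a TCP port number."""
    port = int(str(value).strip())
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


class ParsePortTests(unittest.TestCase):
    def test_plain_number(self):
        # The check a developer would run by hand while writing the function.
        self.assertEqual(parse_port("8080"), 8080)

    def test_regression_port_zero_is_rejected(self):
        # Imagined past fault: port 0 was once accepted. Keeping this test
        # ensures the fault cannot quietly return.
        with self.assertRaises(ValueError):
            parse_port("0")

    def test_out_of_range_is_rejected(self):
        with self.assertRaises(ValueError):
            parse_port("70000")
```

Saved in a file whose name starts with `test`, for instance a hypothetical `test_ports.py`, these tests can be run with `python -m unittest` from the same directory.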
The more you automate your testing, the less likely it is that you’ll forget to run a given test as part of the development process. The more tests you have, the less likely it is that a given problem will slip through the cracks. Most languages have a framework that you can use to build up your automated testing regime – Ruby has RSpec; Perl, Test::Class, Test::Unit, and others; Python, pytest (formerly py.test), unittest, Green, and others … in short, it doesn’t matter which language you use, the odds are extremely high that there will be one or more good frameworks available. Pick one, and make it part of the development workflow; your code – and your environment – will be better for it.
All of this is very simple, basic stuff that you should have in place already. But it never hurts to be reminded, and to check that the development system that should be mimicking production does, in fact, mimic production (rather than having decayed and drifted away over time).