These five DevOps practices are not talked about enough
You know that moment when you hit “Deploy” at 4:59 pm on a Friday and, moments later, your phone explodes with alerts that production is burning? Like, literally on fire!
Or when a senior colleague, Dave, quietly fixes prod but never mentions how. Perhaps because he fears that if an AI learns his secrets, it will completely automate his job? (I am only half joking, of course.)
Welcome to the caffeine-fueled roller coaster we call DevOps. We have all heard the standard advice (“Automate your builds! Integrate continuously!”), but sometimes it is the lesser-known best practices that take you from “Do you even work here?” to company hero status.
That is what we will focus on here: the lesser-known DevOps safety nets that no one talks about. You will also get a few jokes along the way, because if you cannot laugh at a 3 am outage, what can you laugh at?
What is DevOps? For the millionth time
In essence, DevOps simply gets the development team to talk to the IT operations team. It does this by creating an environment and a workflow that naturally draw the two teams together, to the point where they cannot live without each other.
Notice that I used the word “team” for both? That is because the purpose of DevOps is not to create a single team but to build a two-way bridge between the two by fostering a culture of open communication and cooperation.
So, basically, two independent teams working as one (although they secretly hate each other). Implemented well, DevOps can deliver software faster, more reliably, and with better quality by breaking down the invisible walls between development and operations.
1. Embrace chaos through (managed) human error
A single line of code can take down an entire environment when it is written by a sleep-deprived person (or, let's be honest, by any of us on a bad day). Instead of assuming that no one will ever slip, it is better to expect these risks and build them into your strategy. This is where chaos engineering comes in.
Put simply, chaos engineering intentionally creates failures to test the resilience of your systems. But let's add a twist: human chaos engineering. Netflix famously popularized chaos engineering with automated tools that randomly kill services in production. However, real-life systems rarely fail in the neatly orchestrated ways a machine might anticipate.
Most of the time, it is just a well-meaning developer running a command on the wrong server. By letting less experienced teammates or half-awake colleagues loose in a safe test environment, you can see how your system copes with inevitable human error. And let's be honest, if your infrastructure can bounce back from an intern's honest mistake, it can probably withstand anything thrown at it.
2. Document the weird stuff
One of the best ways to avoid recurring production nightmares is to keep a record of every strange bug and one-off fix. Maybe production crashes at exactly midnight UTC every leap year, or there is a memory leak triggered by passing a specific parameter to a legacy route. Without proper documentation, the same problems will keep happening, and each time someone will have to solve them from scratch.
Keeping these incidents in one place (a document, a wiki page, or even a board of sticky notes) is something your future self will thank you for at 3 am one day.
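To make the "one place" idea concrete, here is a minimal sketch of a quirk log kept as a JSON Lines file. Everything in it is an assumption made for illustration: the `QuirkEntry` fields, the function names, and the file format are hypothetical, and a wiki page serves the same purpose just as well.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from pathlib import Path
from typing import List


@dataclass
class QuirkEntry:
    """One strange bug or one-off fix (hypothetical field names)."""
    title: str
    symptom: str
    context: str          # why it happens, as far as anyone knows
    fix_steps: List[str]  # the exact steps that resolved it
    recorded_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def record_quirk(log_path: Path, entry: QuirkEntry) -> None:
    """Append one entry to the log as a single JSON line."""
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(asdict(entry)) + "\n")


def search_quirks(log_path: Path, keyword: str) -> List[dict]:
    """Return every entry whose raw JSON line contains the keyword (case-insensitive)."""
    if not log_path.exists():
        return []
    needle = keyword.lower()
    matches = []
    with log_path.open(encoding="utf-8") as fh:
        for line in fh:
            if needle in line.lower():
                matches.append(json.loads(line))
    return matches
```

The point of the `context` and `fix_steps` fields is that the log captures why something happened and how it was fixed, not only that it happened.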
This reminds me of the “Gen Z” intern who discovered a critical security bug in a dusty old authentication function that everyone had forgotten about, all while leaving gleeful comments in the codebase.
They had stumbled on a serious authentication flaw. And by “serious,” I mean the kind of bug that had been lurking since Obama's first term in office without anyone noticing.
The funniest part was the title of the Jira ticket: “Auth was MAD Sus RN No Cap Frr (a critical security issue).” Jokes aside, this story shows the importance of documenting weirdness: not only the bugs themselves, but also their context and the exact steps to fix them. Without notes on what the old authentication function was supposed to do (and, most importantly, why it was written that way), the developers never saw the giant red flag waving behind the meme-like comments.
So yes, documentation in DevOps may sound boring in theory, but it can expose the biggest problems hiding behind centuries-old code and quick fixes.
3. Keep secrets secret (yes, really)
Everyone knows not to commit credentials to source control, right? Right?! Yet it still happens, and sometimes in spectacularly viral ways.
A good secret management system locks down everything from API keys to database passwords, making sure they never see the light of day. Tools such as HashiCorp Vault or AWS Secrets Manager let applications fetch secrets automatically at runtime, so your precious credentials never live in plain-text code or exposed configuration files (completely safe from those inexperienced interns).
A lesser-known trick: rotate your keys regularly. As in, actually do it, not just in some hypothetical scenario your security team dreamed up. Scheduling routine key rotations (monthly, quarterly, or at some other interval) reduces the risk that stale credentials turn into a huge security problem. And if you cannot automate the rotation, put it in your backlog and mark it as a recurring monthly task. Because stale secrets should never end up out of sight, out of mind.
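If rotation cannot be automated yet, a small check can at least flag what is overdue. The sketch below assumes you can export an inventory that maps each secret name to the time it was last rotated; the function name, the inventory shape, and the 30-day limit are all hypothetical, and no real Vault or AWS API is called here.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


def overdue_secrets(
    last_rotated: Dict[str, datetime],
    max_age: timedelta,
    now: Optional[datetime] = None,
) -> List[str]:
    """Return the sorted names of secrets last rotated more than max_age ago."""
    current = now or datetime.now(timezone.utc)
    return sorted(
        name
        for name, rotated_at in last_rotated.items()
        if current - rotated_at > max_age
    )


if __name__ == "__main__":
    # Hypothetical inventory; in practice it would come from your secret manager.
    inventory = {
        "db-password": datetime(2024, 5, 20, tzinfo=timezone.utc),
        "api-key": datetime(2024, 1, 15, tzinfo=timezone.utc),
    }
    check_time = datetime(2024, 6, 1, tzinfo=timezone.utc)
    for name in overdue_secrets(inventory, timedelta(days=30), now=check_time):
        print(f"rotate me: {name}")
```

Running a check like this on a schedule turns "we should rotate keys" into a list of names that someone has to deal with.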
4. Monitor the little things
Ask any DevOps beginner about monitoring, and they will probably rattle off CPU usage, RAM, and maybe disk space. A great start, but averting a real meltdown often depends on smaller, less glamorous numbers. For example, watch your job queue lengths. If tasks start piling up, that is a harbinger of a bigger slowdown to come.
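As a rough illustration of watching queue length, the sketch below flags a queue whose length has grown at every step across the most recent samples. The function name, the window of five samples, and the growth threshold are assumptions to tune for your own system, not values from any particular monitoring tool.

```python
from typing import Sequence


def queue_is_backing_up(
    samples: Sequence[int], window: int = 5, min_growth: int = 1
) -> bool:
    """True if each of the last `window` samples exceeds the previous one by at least min_growth."""
    if len(samples) < window:
        return False  # not enough history to call it a trend
    recent = samples[-window:]
    return all(b - a >= min_growth for a, b in zip(recent, recent[1:]))


if __name__ == "__main__":
    steady = [5, 5, 4, 6, 5]
    growing = [3, 4, 6, 9, 13]
    print(queue_is_backing_up(steady))   # False: the length is only jittering
    print(queue_is_backing_up(growing))  # True: every sample is higher than the last
```

Requiring sustained growth rather than a single high reading keeps the alert quiet during short bursts.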
Watch your HTTP 4xx and 5xx status codes, especially the lesser-known 499 or 503 codes. These can indicate timeouts, partial outages, or trouble in an individual microservice.
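A minimal sketch of what tracking those codes could look like, assuming you already have the status codes for a time window as plain integers (how you extract them from your logs is up to you). The function name and the `WATCHED` tuple are hypothetical. Note that 499 is a non-standard code that nginx logs when the client closes the connection before the server responds.

```python
from collections import Counter
from typing import Dict, Iterable

# Codes singled out for their own rate, in addition to the 4xx/5xx classes.
WATCHED = (499, 503)


def status_summary(codes: Iterable[int]) -> Dict[str, float]:
    """Return the share of requests in the 4xx and 5xx classes and for each watched code."""
    counts = Counter(codes)
    total = sum(counts.values())
    if total == 0:
        return {}
    summary = {
        "4xx": sum(n for c, n in counts.items() if 400 <= c < 500) / total,
        "5xx": sum(n for c, n in counts.items() if 500 <= c < 600) / total,
    }
    for code in WATCHED:
        summary[str(code)] = counts[code] / total
    return summary


if __name__ == "__main__":
    window = [200] * 6 + [499, 499, 503, 404]
    for label, share in status_summary(window).items():
        print(f"{label}: {share:.0%}")
```

Alerting on the share of requests rather than the raw count keeps the threshold meaningful as traffic rises and falls.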
Logging is another underrated ally. Container-level logs, DNS information, or the log streams of some random microservice can seem like white noise until something goes wrong. Spend some time building alerts for these small metrics, and you will usually catch a disaster before it catches you.
5. Synthetic test data (versus real data in staging)
Look, I get it. As a DevOps engineer, sometimes you need “production-like” data for truly robust tests. But do not talk yourself into copying the real database to staging, tempting as that might be. Especially if your real database is 47 terabytes of real user data containing personal information. If you do copy the data wholesale, scrub or anonymize it properly.
If you skip this, you risk a major leak or, at the very least, a test environment that sends stray emails to real users (“Congratulations on buying your new Tesla from staging!”). A better approach is to generate fake data robust enough to mirror your production patterns. You can use dedicated tools for this, or you can lean on AI to produce near-perfect fake data.
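Both options can be sketched with the standard library alone. The example below assumes user rows are dictionaries with `name` and `email` keys, which is a hypothetical schema, as are the function names and the `plan` field. Addresses are rewritten to the reserved `.invalid` top-level domain so that staging cannot deliver mail to a real inbox.

```python
import hashlib
import random
from typing import Dict, List


def scrub_user(row: Dict[str, str], salt: str) -> Dict[str, str]:
    """Return a copy of the row with name and email replaced by a salted hash."""
    digest = hashlib.sha256((salt + row["email"]).encode("utf-8")).hexdigest()[:12]
    scrubbed = dict(row)  # leave the caller's row untouched
    scrubbed["name"] = f"user-{digest}"
    scrubbed["email"] = f"{digest}@example.invalid"
    return scrubbed


def fake_users(count: int, seed: int = 0) -> List[Dict[str, str]]:
    """Generate fully synthetic users; the seed makes the output reproducible."""
    rng = random.Random(seed)
    plans = ["free", "pro", "enterprise"]
    return [
        {
            "name": f"test-user-{i}",
            "email": f"test-user-{i}@example.invalid",
            "plan": rng.choice(plans),
        }
        for i in range(count)
    ]


if __name__ == "__main__":
    real = {"name": "Dave", "email": "dave@example.com", "plan": "pro"}
    print(scrub_user(real, salt="keep-this-salt-out-of-source-control"))
    print(fake_users(2))
```

Hashing with a salt keeps the mapping stable, so the same user scrubs to the same placeholder across tables, but a salted hash is pseudonymization rather than full anonymization, so the salt needs the same protection as any other secret.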
Final thoughts
Ultimately, your DevOps success does not depend only on fancy pipelines or the latest container orchestration tool. It comes down to a spirit of cooperation, continuous improvement, and, yes, a little organized chaos.
These lesser-known best practices, from unleashing a half-asleep chaos engineer to keeping a log of the weird stuff, add layers of resilience to your delivery process. They may save your bacon the next time you are one wrong move away from nuking production. When in doubt, remember the basic principle: anything that can go wrong probably will, so you might as well plan for it and laugh about it together.
After all, DevOps is as much about the team as it is about the code. And if you or someone else slips up and breaks production, there is always a good story in it, plus hero status for whoever fixes things. Oh, and if you do break production... call me. I will bring cakes.