Avoiding splitbrain in a heartbeat/drbd setup

August 26th, 2009

What follows is the description of a hack I did to avoid splitbrain in a 2-node Linux cluster running heartbeat, with drbd for disk replication.

I am not going to detail how to set up heartbeat and drbd, and will assume that you are already familiar enough with them.

To the heart of the matter: in a standard heartbeat/drbd setup, there still remain some situations that will result in a splitbrain.

Let's take an example: two nodes, N0 and N1. N0 is primary, N1 is secondary. Both have redundant heartbeat links and at least one dedicated drbd replication link. Let's consider the (highly) hypothetical case where the drbd link goes down, soon followed by a power outage on N0. What will happen in a standard heartbeat/drbd setup is that when the drbd link goes down, the drbd daemon will set the local resources on both nodes to state 'cs:WFConnection' (Waiting For Connection) and mark the peer data as outdated. Then when N0 disappears due to the power outage, heartbeat on N1 will take over the resources and N1 will become the primary node.

Here is the glitch: N0 may have made changes on its local disk between the time the drbd link went down and the power outage. These changes were never replicated to N1. And now N1 is running as primary with outdated data.

Not Good.

What we may want is to forbid a node from becoming primary when its drbd resources are not in a connected and up-to-date state. This would avoid most cases of data corruption, but it also implies longer downtime.

As far as I can tell, there is no configuration parameter to do that in heartbeat/drbd. But we can work around that.

When it tries to become primary, heartbeat starts the resources listed in the file /etc/ha.d/haresources. In a heartbeat/drbd setup, one of those resources is drbddisk, which is just a script located in /etc/ha.d/resources.d/. If this script exits with an error code, heartbeat will give up trying to take over resources in the cluster. Beware that this might lead you to situations where both nodes are secondary.

Here are a few lines to add to drbddisk in order to block takeover when the local resource is not in a safe state:

case "$CMD" in
   start)
     # forbid becoming primary if the resource is not clean:
     # count the resource lines that are both Connected and locally UpToDate
     DRBDSTATEOK=$(grep ' cs:Connected ' /proc/drbd | grep -c ' ds:UpToDate/')
     if [ "$DRBDSTATEOK" -ne 1 ]; then
       echo >&2 "drbd is not in Connected/UpToDate state. refusing to start resource"
       exit 1
     fi

NOTE: this patch works only if you have one and only one drbd resource.
WARNING: do not modify those scripts unless you know exactly what you are doing...
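
The same check can be sketched as a standalone function, which makes it possible to try the test against a saved copy of /proc/drbd before touching drbddisk. This is only a sketch: the function name drbd_state_ok and its optional file argument are assumptions made for illustration and are not part of drbddisk.

```shell
# Hypothetical helper (not part of drbddisk): succeeds only if exactly one
# resource line in the status file is both Connected and locally UpToDate.
# The file argument defaults to /proc/drbd; pass a saved copy to experiment.
drbd_state_ok() {
    status_file="${1:-/proc/drbd}"
    # grep -c prints the number of matching lines (0 if there are none)
    count=$(grep ' cs:Connected ' "$status_file" | grep -c ' ds:UpToDate/')
    [ "$count" -eq 1 ]
}
```

Inside drbddisk it would be used as `drbd_state_ok || exit 1`. Like the patch above, it assumes a single drbd resource.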

This done, there are a few more things you may need:
- if you are using heartbeat in combination with ipfail, you might want drbddisk to forbid the node from becoming primary if it can't ping a given host (for example the gateway). That could look like:

# PINGHOST must be set beforehand, for example to the gateway address
PINGCOUNT=$(ping -c 4 "$PINGHOST" | grep -ci "destination host unreachable")
if [ "$PINGCOUNT" -eq 4 ]; then
   # we lost all 4 packets. the network is down and the other node
   # might still be primary. don't come up.
   echo >&2 "cannot ping $PINGHOST. refusing to start resource"
   exit 1
fi

- if you are using a stonith device, you may want to modify the stonith script to forbid stonithing the peer if the local resources are not in a connected/up-to-date state. There is indeed a chance that the peer node is still functional while the local node definitely is not.

Born to be root

August 21st, 2009

I like my life right now.
I recently changed jobs and started working for a young IT firm, as a software developer. But before developing some code you need servers to run it on. So for the past two months I have been a sysadmin.

BORN TO BE ROOOOOT. Yeh yeh yeh.

It's like being a kid in a candy shop. I get plenty of monster servers to play with, setting them up in high-availability clusters, reading tons of docs, playing around with heartbeat, drbd, nagios, system setups and the like. And once I am done plugging everything in at the server hall, I seize a huge hammer with spikes and I turn violent on them. I pull plugs, cut the power, unplug wires, remove hard drives and power supplies from living servers. And I watch their struggle for survival. I split their brains and watch how they fight for control. That's really a lot of fun. And extremely instructive for the developer I am.

So very soon I'll have a 21st century clustered network to run my code on. How cool is that!

From good old batch systems to actor based parallelism

March 27th, 2009

I was recently reading on JavaWorld a series of 2 articles discussing the advantages of actor-based parallelism over traditional approaches involving shared objects and locking mechanisms. Those articles can be found here: part1, part2.

The first article gives a very good introduction to how actor-based parallelism works, so I won't spoil the web with a worse introduction of my own brew. Let's just recall the key properties of the actor model: lightweight threads running in parallel, sharing nothing, but communicating by sending each other immutable messages. Each actor has an input queue of messages to process, called an inbox. That's it. No shared memory, no deadlock-prone and brain-hammering IPC mechanisms.

Two things struck me with the actor model.

It's beautiful. The actor model works and is dead simple to use. It only requires thinking differently. To me, that is a clear sign of a beautiful design.

The other thing that struck me is the similarity between some designs used in older batch systems and the actor model.

But let me first tell you a bit of my own history: I have been working for quite a few years now with a large financial system that was designed and built as a batch system. It is made of a constellation of small and relatively simple programs that communicate with each other by dropping files into each other's inbox. An inbox is just a directory with a specific location. All those programs run one after the other in a specific order and according to a daily schedule. And this schedule is repeated day after day. What we get in the end is like a big state machine that slowly but surely shuffles all of its tickets through various business flows.

This way of designing batch systems is sometimes called 'the inbox model'. It is quite standard and has been used for a long time. And it works very well.
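
To make the inbox model concrete, here is a minimal shell sketch of one such program. Everything in it is an assumption made for illustration: the function name run_stage, the two directory arguments, and the upper-casing step that stands in for real business logic. It is not how the financial system mentioned above is actually written.

```shell
# Hypothetical batch stage: consume every file in its own inbox and drop a
# result file into the next program's inbox. Files are never edited in place.
run_stage() {
    inbox="$1"     # directory this stage reads from
    outbox="$2"    # inbox directory of the next stage
    for msg in "$inbox"/*; do
        [ -f "$msg" ] || continue    # empty inbox: the glob matched nothing
        name=$(basename "$msg")
        # stand-in for real work: write an upper-cased copy to a hidden temp file
        tr 'a-z' 'A-Z' < "$msg" > "$outbox/.$name.tmp"
        # rename into place so the receiver never picks up a half-written file
        mv "$outbox/.$name.tmp" "$outbox/$name"
        rm "$msg"                    # the message has been consumed
    done
}
```

A daily schedule would then simply call such stages one after the other, each with its own inbox and the inbox of the next program.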

What struck me is how close this design is to the actor model. Instead of lightweight threads we have stand-alone processes. Instead of messages we have files, but those files are immutable in the same way as the messages passed between actors: they are not modified between the emitter and the receiver. Those stand-alone programs have inboxes much like those of actors. In the end, the main difference is that the stand-alone processes in a batch system run sequentially while actors run in parallel. Which is where you may think:

"Wait a minute! If the processes in a batch system already implement the same message passing mechanism as actors, why couldn't they be run in parallel?"

And the answer is: they can. Assuming no other information passing mechanism is in the way, you could take such a batch system and make it run in parallel with relatively little modification. This insight gave me a feeling of awe and respect toward the inventors of the inbox model.

Of course, real life is not that simple. Most of the time, the processes of a batch system also communicate with each other via some database. In a way, a shared database realm is just like a shared memory and we are therefore back in the headache of shared-state parallelism. Too bad.

The source holds the truth

November 4th, 2008

The source holds the truth.

That's it, really. In the developer's world, there is no need to add anything else. No need for metaphor-filled rhetoric, no need for demagogy or opinion surveys, no "politics". Your code is either right or wrong. It's slower or faster than this other implementation. It contains that bug or not. Everything can be proved.

What a relief to be able at any time to seek refuge in this universe of selfless logic, to run away from petty personal fights and let your soul rest in the clear knowledge of your own skills and limits.

Everything is impermanent

April 6th, 2008

I had a shock yesterday when I realized that I could not remember the name of the software I was working on 5 years ago.

5 years ago I had been working for 3 years on developing a software platform for real-time event monitoring. I remember spending a wonderful time working on specific parts of that system; I can even recall specific days based on what kind of code I was writing then. I used to be genuinely engaged in this software. I probably stood numerous times under the shower thinking of particular problems to solve. I would walk through the town munching on juicy refactorings. During those years, this software took an important place in my life.

And now I can't even remember the name of it.

Will it be the same in 5 years with the software I am currently working on, the software I have been living with inside my head and have been dedicated to for the last 4 years?

Developer motivation versus team growth

April 4th, 2008

It seems hard to keep the motivation and engagement of a team of programmers at its top when their software's growth requires the team to expand. There is a fine line to walk there, along which striving to preserve individual creativity might be a key to maintaining balance.

Most software products grow out of the joint effort of a small number of developers, usually something like 3 to 5 people. A small group like that can be a fantastic catalyst for individual energy. Everyone gets space to exert his or her own creativity. The product being only at its birth stage, team members are there for fun rather than for profit. There is a positive competition between team members since individual energy can focus on the challenge itself rather than on the hierarchical and managerial issues around it. This list could go on for a while. What I am trying to say is that in my experience, the most creative and exciting development teams to work in are small ones.

That's an idea emphasized in agile methods and scrum, which firmly believe in strongly focused, self-organized and SMALL teams of developers.

In 'Peopleware', such teams are called 'jelled teams'.

When the number of developers in a team reaches a critical limit, somewhere around 5, the dynamic of the team changes. Roles are introduced, such as architect, lead-developer, CTO, boss. Decision making gathers within some roles and escapes others. An unhealthy competition grows when people strive to get those roles. People get bossed around. Developers feel they have no space left for their creativity. The fun factor tumbles.

Of course, team scale is related to product growth. It's because the software is growing that you need more developers. But if you insist on keeping them in the same team, you will exceed the critical mass for jelled teams and lose the positive creativity in the team. I guess a way to avoid that would be to break overgrown software into smaller components and grow a jelled team around each of them. But it's easier said than done and can be hard to justify to management.

Team growth is not always directly related to code growth though. At places where management has taken over from the engineers in deciding the product's future (which means at more or less every mature software company), management can cause teams to grow in completely artificial ways that do not respect the current balance of the jelled teams.

Let's take an example. A company has a jelled team of 5 that has been developing a product over the last 5 years. The company also hosts a number of parallel teams working on related products. Management decides that the future of their product line requires a complete redesign of the product map, so they hire a number of software architects to decide on what to do. They hire those architects from outside the company because they worry that in-house developers promoted to the architect role would remain too loyal to their team and product and hence have a biased view on what to achieve. Those decisions make sense from their perspective, but out of mistrust for their most productive developers, this management has just given a death blow to its jelled teams.

Developers in jelled teams have given their souls to make their product a successful one. The very act of them doing so is usually what made the product successful, but that's something that non-technical management tends to be blind to. To maintain this level of engagement, you have to leave enough space for each developer to make design decisions and use his creativity. When you hire external architects, they will want to make design decisions. That's their job and that's what they enjoy doing. A conflict will occur between architects and developers on who gets to decide. Most likely, the architects will get the last word, because design is their role and the developers' role is implementation, at least in the management's simplified understanding of the craft of software development. Members of the jelled team will feel robbed of their creative freedom, and leave.

Management probably won't notice this chain of events. A few developers will leave. New ones will be hired. The new ones will do what the architects tell them to do and everything will look good. But a process of natural selection has occurred that will keep a certain kind of highly efficient and engaged developers away from this project. A few years later, the product will need a complete rewrite because no developer took it upon himself to refactor the code, and no architect noticed the need.

All this boils down to the belief that successful software development is about getting the right developers. Some developers are just way more efficient than others. It has been proven in many ways and has grown into a widespread belief in the software community, but it still remains somewhat of a taboo in many circles because it strongly implies that there is a kind of elite of software developers. It's unfortunate, because elitism has very little to do with what makes those developers so productive. They usually do what they do for fun, seek no power and have little interest in career competition. To get those people, you don't even have to pay them much or to lure them with bright career plans. But you do have to offer them an environment in which they can create freely and together with people like them. Break this balance and they will leave.

E-factor

March 7th, 2008

Today, Peopleware taught me about the E-Factor.

The environmental factor is a simple measure: count the number of uninterrupted hours you have at work during an average work week, and divide it by the number of hours of office presence. The idea behind this ratio is to give a measure of how much of your office time is spent actually producing. For a software developer, producing means writing code or thinking about it. Both activities require the developer to be in a state that psychologists call flow, an almost meditative state of deep and active concentration.
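
The ratio described above can be sketched in a few lines of shell. The function name e_factor and the hours used in the example are assumptions chosen purely for illustration.

```shell
# Hypothetical helper: E-factor = uninterrupted hours / hours of office presence.
# Fails (non-zero exit status) if the presence hours are not positive.
e_factor() {
    awk -v u="$1" -v p="$2" 'BEGIN {
        if (p <= 0) exit 1
        printf "%.2f\n", u / p
    }'
}

e_factor 8 40    # prints 0.20
```

With these illustrative numbers, 8 uninterrupted hours out of a 40-hour week give 0.20, and it would take 16 uninterrupted hours to reach 0.40.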

I love flow. That's the reason I became a software developer.

And I have felt increasingly frustrated about it during the past year. Let me tell you my story...

A number of things happened at work over the past 2 years: we moved, and I got to sit just beside a heavily trafficked passage in the office. I sat with my back toward visitors, so I never knew whether someone standing behind me was just passing or trying to get my attention. I should mention that the other developers and I are sitting in a large open space and are surrounded by groups of testers, managers and other development teams. At any time I have some 30 people within earshot. In the middle of the floor lie the printer room (right behind me) and the dining room (behind me to the right), which people enter and leave in an almost uninterrupted stream. I have been working there for a few years now and I am sitting on quite a lot of system knowledge, so people tend to visit me quite often to ask for insights and help. My software team is quite small and has a rather high turnover, so I also have to be available to the new developers and support them when needed.

At the same time, I became the main developer on a truly complex project consisting of designing and implementing various rates to measure the growth and return of portfolios of funds. That was a tricky task, involving financial maths and an expert understanding of a complex piece of software for fund trading. In fact it was more of a research project, since no one knew how to actually implement those rates.

This project required flow. High quality flow.

Now, if you have read Peopleware or if you just have a bit of experience in software development, you probably already see where I am heading.

The project's deadline had been fixed ahead of time, before we even understood the true complexity of the task. Some of the rates were put in production at the deadline, and fundamental flaws in their very definition were quickly identified when customers started calling our support center after seeing crazy numbers on their accounts. From then on the project became a race to quickly re-think and re-implement those rates. That put a lot of pressure on me. I got chronically stressed, slept badly, became distant from my colleagues. I was showing symptoms of getting worn out.

Meanwhile, my working environment hadn't changed. I have measured my E-factor this week and got 0.2. It's bad. It's nowhere near the level required by the situation.

Peopleware mentions 0.4 as an acceptable level. A lead developer of an agile team reports an amazing 0.8 here.

With those hard numbers in hand, I have clear arguments to try once more to tell my management that our office structure is not suited to software development.

Richard Stallman at KTH

February 28th, 2008

Tuesday evening I had the chance to be one among a crowd of students and other instances of varied species of computer enthusiasts to see Richard M Stallman in the flesh and listen to him. He gave a 2-hour speech at Kungl. Tekniska Högskolan (where I studied many years ago).

He had gathered quite a crowd:



And he got a hat:



And then he went on raving for 2 hours about why freedom is so crucial to the future of software. His talk was inspiring though not always easy to agree with. As a software developer I feel more receptive to the arguments raised to promote open-source (increasing code quality, avoiding key person dependency...) than to those promoting free code. But that's probably because I haven't been thinking much about the ethics of writing software.

Engineer++

December 4th, 2007

You can't be a corporate programmer and only know how to write code.

If writing code is all you can do, you are doomed to oblivion in a dark corner of a forgotten underground office. Your social network will be the size of a handkerchief. Your job might be every programmer's wet dream, but you will not be a corporate programmer.

Being a corporate programmer is really about having at least two full time jobs.

One is about knowing half the computer languages ever created, understanding computers down to their tiniest cell tissues and mastering uplifting issues such as deadlocks, atomicity, float representation, mutexes and all their small friends, which all together make Kasparov's chess games look like child's play. That's the easy part, mostly because it's fun.

Then there is your other job.

If you are developing a travel search engine, you will have to know everything there is to know about the travel business. Everything. And you will know it even better than the people whose only job is to work in travel agencies. Down to the smallest details.

If you are developing software for automatic trading on stock exchanges, you will end up being a financial analyst. You could even teach the damn stuff to financial analysts!

And that's the only way your job can be done. Anything less, and your work would be just sloppy.

When good is good enough

November 20th, 2007

Have you ever played this game: taking a large piece of paper, sitting down beside it together with a couple of small children, and all together trying to draw, say, a fairytale landscape?

If you are in it for the fun, you're gonna get loads of it.
But if your aim really is to draw a fairytale landscape, like those on the covers of fantasy books, you are going to experience a great deal of frustration.

Now, a similar effect awaits you when writing software in a group: if you are a code perfectionist, you are going to be frustrated. Simply because normal developer teams are a mix of people with varying ambitions.

At some point, a developer has to let go of his own feeling of responsibility toward the code, especially toward the code he has written himself. Some day, someone else will start changing your code, making it look weird and clumsy, introducing new ideas, taking charge of your code. It will happen and it is fine (as long as they don't break your regression tests).

But in my experience, letting go of your code is hard, especially code that you have polished during many hard-working hours. It's in human nature to care for what you create.

Once, during a conference long ago, I heard a developer call that 'ego-less' programming.

I am still trying to get better at that. I even wrote 'ego-less' on the corner of my desktop screen, to stare at it every day. But this is a typical example of self-refactoring: it takes time...