Trunk. Development. Testing. - Ros-dev

16 Oct 2006


      Hello,
I thought for a long time before writing this email, and finally it's
time to write it...
N.B. I don't intend to hurt or offend anyone in this email. And
there is not much use in a project leader who only says "yes",
"good", "agreed".
We can all see that ReactOS has become quite a big thing. Big in every
respect: source code, goals, development time... I think one would wish
every FOSS project to reach that level and speed of development.
If any of you played Civ in your youth, you certainly remember
that your cities can't just grow and grow without any effort:
people become unhappy, the city gets overcrowded and demands sanitation
and special facilities in order to continue growing. If you don't provide
those facilities, the city will at the very least stop
developing, and usually it falls into disorder due to unhappiness.
A similar thing happens in software development too. Once a project
reaches a certain size (even just in terms of SLOC), the developers'
"tools" have to be upgraded and enhanced.
That's it for theory. Let's have a look at ReactOS, and more  
specifically its trunk.
I'm getting more and more complaints that trunk is unstable, remains
unstable, and most of the time doesn't even boot. The time has come
to actually solve this problem, once and for all.
As an example, here is just one common scenario:
Some developer commits his code, which works for him on his, say,
qemu but doesn't work for another dev on his real hardware. That
developer continues to commit code, and after some time runs into the
dev whose machine doesn't boot ReactOS, who says: "hey,
you ...ng ...rd, you broke trunk!". I don't even dare to cite the
reply that follows, because it would take a few pages to quote. The
developer who broke trunk starts to regress test, reverting revision
by revision, spending time to identify the regressing revision (this
may take a few days, and we don't have full-time paid developers), and
then finally finds the bug, fixes it and commits. By that time, another
developer has committed code with a regression in another place, and
trunk is still unbootable. And the blaming continues, flamewars start
over who broke what, instead of everyone actually enjoying the
development process.
Pretty unproductive, yeah? And you, the reader, are definitely
wondering: what is the solution?
My answer: there are several solutions, but as always I will list only
a few.
1) The "Wine" method. There is one leader who decides what, how and when
to commit. He maintains the tree, he makes sure the tree is in good
shape, he spends all of his time testing, merging, reverting and
remerging.
It works really well for Wine. Will it work well for us? I really
doubt it.
2) Improving our current development system. That's the direction I
would like to take.
The first and most important improvement needed is testing. Testing
often, testing early, automating testing, getting more people to test
and regress test, and feeding the results back to developers.
Buildbot was a step in this direction, but more steps are needed.
If someone breaks booting, a proper, recognized complaint would be
"Revision NNXXY regresses 3rd boot with a bugcheck code NN, stack
trace attached". The sooner this information is available and the
developer is notified, the faster the bug will be solved.
If the developer is unreachable for a long period of time, a decision
to revert the revision might be taken (that should be a really rare
case).
This shouldn't be taken to absurd extremes, of course. I can certainly
tolerate trunk being broken for a few hours while a developer is
working on a certain feature, and, well, of course he may make a
mistake, not fully commit something and so forth.
But having it half-broken for 2 weeks, with no one taking the time to
regress test and find the guilty revision, is not acceptable.
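For illustration only: finding the guilty revision does not have to mean
reverting revision by revision. A minimal sketch of a binary search over
revision numbers follows. It is not an existing ReactOS tool, and it
assumes that a known-good and a known-bad revision are available and that
every revision after the regression stays broken. The function name, the
`boots` callback (standing in for "build this revision and try to boot
it") and the revision numbers are all hypothetical.

```python
from typing import Callable


def find_first_bad_revision(good: int, bad: int,
                            boots: Callable[[int], bool]) -> int:
    """Return the first revision in (good, bad] that fails to boot.

    Assumes `good` boots, `bad` does not, and that once the regression
    appears every later revision is broken as well.
    """
    if good >= bad:
        raise ValueError("the good revision must be older than the bad one")
    # Invariant: `good` boots, `bad` does not.
    while bad - good > 1:
        mid = (good + bad) // 2
        if boots(mid):
            good = mid   # regression was introduced after mid
        else:
            bad = mid    # regression is at mid or earlier
    return bad


if __name__ == "__main__":
    # Hypothetical stand-in for building and booting a revision:
    # pretend that revision 137 introduced the regression.
    def boots(revision: int) -> bool:
        return revision < 137

    print(find_first_bad_revision(100, 200, boots))
```

With 100 candidate revisions this needs about 7 build-and-boot cycles
instead of up to 100, which is the difference between an evening and a
few days of regress testing.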
Or yet another example, fortunately the last one for this email. I
implemented an "alpha" version of a USB mouse driver, along with the
recently incorporated NT4-compatible USB driver. I assumed people would
test it, and even asked on IRC for it to be tested. It's not that hard,
but it gives me a chance to fix not-so-obvious issues before the driver
is enabled by default in trunk and drives people crazy with some
regression which I didn't see on my machine. And, finally, even hearing
"yay, your usb mouse driver works!" simply motivates me to finish e.g.
a USB keyboard driver, find and fix bugs, etc.
Conclusions.
1) To improve development quality we must improve testing. Automate,
gather a bigger testing team, etc.
2) Developers must be way more careful when committing. Just remember a
simple rule: trunk is not a wastebin you dump code into. It's
quite the opposite.
Feedback.
Your feedback is greatly encouraged and awaited. Feel free to discuss
this on this ML.
With the best regards,
Aleksey Bragin
ReactOS Project Coordinator