Wednesday, April 01, 2015

God complex - why open models will win

Engineering and science can never be about religion. Both are about trial and error, empirical evidence supporting trials, precision, and formulating the math behind all this. It's really easy to forget this though, especially if you've hired really good engineers / scientists. With good engineers / scientists you might cut corners, or simply assume that you'll always have the best possible answers on board. A good thesis can only be good if it covers all known ground and provides an in-depth analysis that likely was never considered before. See my article and review of the Big Bang theory for the high bar I mean by good science.

Because of all this, and given the rapid pace of change in science, technology, knowledge and information flow, I suspect there is a limit beyond which closed development models can no longer outpace open development models. Although I have no evidence for this, I believe the reasoning is relatively easy to follow. Folks who disagree might find it harder to prove the counter, which leaves me content without having to provide a full proof. I have found this particular issue in Engineering / Science best described by Tim Harford in a TED Talk titled "God Complex", and I highly encourage anyone who hesitates over the "open models outpacing closed models" premise to go watch it.

I'll use this premise in this post, as just one example, to argue that open hardware development should outpace closed hardware development models, just as open software development models very likely already outpace closed proprietary software development models (we can't prove this as we don't have math on private development models). I'll go into the details of my conjecture next and provide a brief guideline for folks who want to test this conjecture on open hardware development.

Engineering is not supposed to be easy. It's fucking hard, and if you have it any other way you're fooling yourself that what you are doing is Engineering. Kernel development is not supposed to be easy either, and considering that on Linux we're engaging openly with the entire planet on the largest collaborative development project there is, it's no surprise that engineering on Linux has a steeper learning curve than the average software engineering project. Even though we've prided ourselves on the informality of much of our engineering practice, over time our growing pains have taught us a few principles and best practices that help us both scale and engineer collaboratively more effectively. A few easy to follow examples of this are:

  • The practice of using Subsystem Maintainers, where our software is broken down into components and folks are then put in charge of the upkeep of each component. Linus just pulls the strings of all maintainers together during the merge window.
  • The development of the Developer Certificate of Origin (DCO), whereby after some legal considerations we realized it's best to throw in some Signed-off-by / provenance guarantees on software, in a way that still allows us to keep up our pace of development.
  • A Code of Conflict to enable us to deal with unfortunate extreme mishaps that come from the outright difficult nature of engaging with grumpy, overloaded maintainers and the community in the open peer review process.
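To make the Signed-off-by / provenance idea a bit more concrete, here is a minimal sketch in Python that checks whether a commit message carries at least one such line. The helper names and the regular expression are my own assumptions for illustration; this is not part of any kernel tooling, and real tooling is stricter about where the line may appear.

```python
import re

# Hypothetical pattern, assumed for illustration: matches a line such as
# "Signed-off-by: Jane Doe <jane@example.org>" anywhere in the message.
SIGNOFF_RE = re.compile(
    r"^Signed-off-by: (?P<name>[^<>]+?) <(?P<email>[^<>\s]+@[^<>\s]+)>\s*$",
    re.MULTILINE,
)


def signoffs(commit_message: str) -> list[tuple[str, str]]:
    """Return (name, email) pairs for every Signed-off-by line found."""
    return [
        (m.group("name"), m.group("email"))
        for m in SIGNOFF_RE.finditer(commit_message)
    ]


def has_signoff(commit_message: str) -> bool:
    """True if the message carries at least one Signed-off-by line."""
    return bool(signoffs(commit_message))
```

A maintainer script could call `has_signoff()` on each incoming patch and bounce the ones that come back false, which is roughly the kind of cheap provenance check that keeps the pace of development up.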
Many software projects have learned from Linux. The Subsystem Maintainers model is prevalent, although likely not invented on Linux. As I've described in a previous post, the DCO is also already heavily embraced by other projects, and other projects are now encouraged to use it thanks to our effort to separate it from Linux. Many projects have Code of Conflict agreements; that is not unique to Linux. There's one aspect of the Code of Conflict that is important to highlight. It goes in only implicitly, but I'd like to make it explicit now and use it as the primary premise for this post. Here is the language I'd like to highlight:
Your code and ideas behind it will be carefully reviewed, often resulting in critique and criticism.  The review will almost always require improvements to the code before it can be included in the kernel.  Know that this happens because everyone involved wants to see the best possible solution for the overall success of Linux.
I'm going to summarize this as: Engineering is hard as fuck, expect people to call you out on your shit. Deal with it, but if you feel we're unreasonable you can tap out. But most importantly: expect your first iteration on an idea to likely not be correct and to require improvements. Even the most seasoned developers should expect this.

Before working for a purely software company I used to work at a hardware company, Atheros. The role I engaged in was unique, given that Atheros was providing full ASIC silicon designs for 802.11 technologies without requiring any CPU on the devices themselves. This meant that, contrary to most 802.11 devices in the industry, we worked without any firmware: all operations of the device were completely transparent to the device driver. Since I worked on an open device driver, all 802.11 hardware operations were completely open and transparent to the community, whereas device drivers that relied on firmware would have hardware operations performed behind the scenes, offloaded onto the device's own CPU / proprietary firmware. Before I joined Atheros I used to believe that Atheros had the best 802.11 hardware in the industry. After I joined Atheros, and particularly as peers got hired by other 802.11 silicon companies and we collaborated, I became convinced that it was not just Atheros' unique hardware that made it stand out.

The quality of support for Atheros' 802.11 devices can also be attributed to:

  1. The full ASIC design nature of it (not requiring firmware): hardware issues were punted out to the device driver, which made the device operate much better than others
  2. A strong community commitment / know-how and engagement
One thing which I'd like to highlight from the above graph is that at times the community was making more contributions to the ath9k device driver than Atheros (later known as QCA). Both of the above are instrumental for a healthy, openly developed device driver, but I cannot stress enough how critical to success it was not to require firmware. I told folks repeatedly that we should not feel embarrassed about having hardware bugs. We should accept this as part of the nature of hardware design and silicon development. It's the rate at which you can fix these, even if through software workarounds, which will ultimately create the best experience for users. If you have firmware, the pipeline for fixes requires engaging with a team of engineers inside a company, and fixing issues there typically takes a significant amount of time. Without firmware even the community was able to participate in creating fixes for extremely complex issues, and this is extremely important for complex technologies such as 802.11. As we combine more RF technologies and things get more complex, we will have no other option but to work and engage with the community. Thinking anything contrary to this makes you fumble and fall into the "God complex" trap.

At Atheros, during the good ol' days, we were able to act on the belief that we'd gain more successful contributions and a healthier development model by opening up firmware on other devices where firmware was actually needed. We first tested this with carl9170 and later with ath9k_htc, both of which did require firmware but for which we managed to open source that firmware. I believed our efforts to be pivotal, and an engaged enthusiast reader might wish to collect metrics on carl9170 and ath9k_htc to help evaluate the impact of openness on software quality.
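For readers who want to try such metrics, here is a rough sketch in Python of one possible starting point: splitting commits into "vendor" and "community" by the author's email domain. Everything here is an assumption for illustration. The function name is made up, the sample addresses are fake, and which domains count as the vendor is something you'd have to decide yourself. The input is assumed to be one author email per commit, for instance as printed by `git log --format=%ae` on a driver's directory.

```python
from collections import Counter
from typing import Iterable


def contribution_split(author_emails: Iterable[str],
                       vendor_domains: set[str]) -> Counter:
    """Count commits as 'vendor' or 'community' by author email domain."""
    counts = Counter()
    for email in author_emails:
        email = email.strip().lower()
        if not email:
            continue  # skip blank lines in the input
        domain = email.rpartition("@")[2]
        # A domain counts as vendor if it equals, or is a subdomain of,
        # one of the given vendor domains.
        is_vendor = any(
            domain == d or domain.endswith("." + d) for d in vendor_domains
        )
        counts["vendor" if is_vendor else "community"] += 1
    return counts


# Made-up sample input, one author email per commit.
sample = [
    "dev1@vendor.example",
    "hacker@home.example",
    "dev2@eng.vendor.example",
    "student@uni.example",
]
split = contribution_split(sample, {"vendor.example"})
```

Email domains are a crude proxy, since employees often commit from personal addresses, and commit counts say nothing about quality by themselves. Treat this as a first cut towards a measurement, not the measurement itself.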

At the last Linux wireless summit that I actively participated in, before joining SUSE, it was made clear that all manufacturers were moving away from full ASIC designs for 802.11 and that all silicon companies were going to be using proprietary firmware. There are a lot of reasons for this, and some of it has to do with the combination of different RF technologies (not just 802.11). Nevertheless, the saddest part of all this to me was that the good lessons learned from the success of fully open drivers and open firmware models were not being seriously considered for future 802.11 device drivers and architectures. Part of the problem is that the above arguments for "goodness" have no direct hard science associated with them, which is why I ended up working towards a hard science for ethical attributes.

Lacking hard science as proof for "goodness" might seem like a bad thing, but it's also a great opportunity. New startups and folks designing new hardware who already "get it", and who are not tied down by archaic legacy business requirements, have a fully open arena for exploration. This is the best situation to be in. Venture capital should easily be able to prove my conjecture with a few simple test cases. Existing silicon companies (not startups) might face the dangers of free software, so at least within the realm of open hardware designs they should consider using their hoards of unused / closeted / legacy designs to test new innovative approaches with the community. And then there are the companies / organizations which have already been perfecting open collaborative development models: they have much to bring to the table for new startups / business models which perhaps have never explored such things. There's room for a lot of experimentation and trial and error. I'm happy for my conjecture to be disproved, given that all this is not about religion but rather about the best fucking engineering possible. I remain optimistic though.