Friday, February 15, 2008

A Deeper Dive into FOSSology

In December I blogged about HP’s FOSSology, a framework that enables the scanning of code. The first plug-in for this framework looks for license text within source code using templates for about 270 licenses and license variants.

I did not see deep investigative reporting in the press coverage of FOSSology, so I thought I'd write a blog post that does a deeper dive.

Why did HP open source FOSSology? HP transformed FOSSology into an open source project because some of its customers learned that HP had an internal tool that it intended to offer as a product, and they asked HP to open source it. These customers wanted a free tool, and HP wanted to generate services revenue to recover the costs associated with its efforts.

We at Black Duck feel that there is always room for free and open source (FOSS) code scanning tools in our market. However, we feel that a FOSS offering is a good thing only if the code committers are fully committed. SourceForge is rife with projects that consist of only headers and a small pile of code, now dormant or unfinished for lack of commitment.

I expect HP to apply resources to this FOSSology opportunity. I also expect that the increased awareness in the marketplace created by HP’s offering will increase interest in Black Duck solutions.

Most Black Duck customers want to get out of the business of developing and maintaining homegrown compliance management tools and are instead adopting commercially supported products that give them the equivalent of the Good Housekeeping Seal. A FOSS project does not accomplish this objective, even one – maybe especially one – that was started by HP. While a FOSS project will provide the benefits of community development, it will not allow an enterprise to eliminate its internal efforts. Most enterprises want to deploy their engineering resources on building new products or applications that contribute to their competitive advantage, not on internal tools. This may be especially true during a recession, at least if the past is a guide to the present. We believe that only a small number of organizations and corporate developers will be interested in the FOSSology approach.

By comparison, Black Duck’s flagship product, protexIP, has information on over 1,245 different licenses. protexIP does code analysis of a code base (as opposed to code scanning) using multiple analysis techniques – including code matching against a KnowledgeBase (KB) of more than 152,000 open source projects and 485 million (to be exact: 485,191,760) source files, dependency analysis, and license text search. We maintain the integrity of protexIP’s KB by maintaining it ourselves.

The Black Duck approach also makes the KB the most comprehensive and relevant resource for code analysis on the market. Lawyers, developers, and managers who ask what is in a code base get answers because protexIP is the best available technology (BAT) on the market. BAT is a legal standard, not just an IT goal.

An engineering-oriented (and, by extension, legal) example of protexIP as BAT occurs when an application uses binary files, or when source code has been cut and pasted but the copyright text and license headers were not retained. A simple license text search will not work in any of these cases.
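
To make the cut-and-paste case concrete, here is a minimal Python sketch. It is not how FOSSology or protexIP is implemented; the license phrases, the sample code, the project name, and every function name are assumptions made up for illustration. It contrasts a phrase-based license text search with a naive line-fingerprint match against a small index of known code.

```python
import hashlib
import re

# Hypothetical license templates: short phrases chosen for illustration only.
LICENSE_TEMPLATES = {
    "GPL-2.0": "gnu general public license",
    "MIT": "permission is hereby granted, free of charge",
}


def normalize(text):
    """Lowercase and collapse whitespace so formatting differences do not matter."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def find_license_text(source):
    """Return the names of templates whose phrase appears in the source."""
    haystack = normalize(source)
    return sorted(name for name, phrase in LICENSE_TEMPLATES.items()
                  if phrase in haystack)


def code_fingerprints(source, min_length=8):
    """Hash each code line, skipping comment lines and very short lines."""
    prints = set()
    for line in source.splitlines():
        stripped = normalize(line)
        if len(stripped) < min_length or stripped.startswith(("//", "/*", "*")):
            continue
        prints.add(hashlib.sha256(stripped.encode("utf-8")).hexdigest())
    return prints


def match_known_code(source, index):
    """Return the known projects that share at least one fingerprint with the source."""
    return sorted({index[fp] for fp in code_fingerprints(source) if fp in index})


# A made-up file with a license header, and the same code pasted without it.
ORIGINAL = """\
/* Licensed under the GNU General Public License, version 2 */
int checksum(const char *s) {
    int total = 0;
    while (*s) total += *s++;
    return total;
}
"""
PASTED = "\n".join(ORIGINAL.splitlines()[1:])

# Toy stand-in for a knowledge base: fingerprint -> project label.
INDEX = {fp: "example-project (GPL-2.0)" for fp in code_fingerprints(ORIGINAL)}

print(find_license_text(ORIGINAL))      # header present, phrase is found
print(find_license_text(PASTED))        # header stripped, nothing to find
print(match_known_code(PASTED, INDEX))  # the code itself still matches
```

Exact line hashes are far too brittle for real use, since a renamed variable defeats them, but the sketch shows why a search that depends on license text has nothing to find once the header is gone.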

In addition, what most enterprises are looking for, rather than just a list of licenses discovered, is an accurate report at the conclusion of the code analysis process that identifies each component in use, its version, the corresponding license, how the code is being used, and whether that usage is approved. As most of the FOSS community understands, the impact of FOSS licenses is not determined exclusively by whether the code is present or not – it’s in how the code is mixed together, and how it’s ultimately used.

For example, the use of GPL-licensed tools is very different from the use of GPL-licensed code in an ISV’s product. So in addition to its analysis capabilities, protexIP enables customers to understand these relationships, and to put the code through an approval process whereby each usage is well-understood and approved.
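
As a rough illustration of the kind of report described above, the following Python sketch models one report entry per component. The field names, the usage categories, and the sample entries are all hypothetical; this is not protexIP's report format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComponentUsage:
    """One line of a hypothetical code analysis report."""
    component: str
    version: str
    license: str
    usage: str      # assumed categories: "tool" (used in-house) or "shipped" (in a product)
    approved: bool


def pending_approval(report):
    """Return the entries whose usage has not been approved yet."""
    return [entry for entry in report if not entry.approved]


# Made-up entries: the same license appears under two different kinds of usage.
REPORT = [
    ComponentUsage("example-compiler", "2.1", "GPL-2.0", "tool", True),
    ComponentUsage("example-lib", "1.0", "GPL-2.0", "shipped", False),
]

for entry in pending_approval(REPORT):
    print(f"{entry.component} {entry.version} ({entry.license}, {entry.usage}) needs approval")
```

The point of the structure is that the license alone does not decide the outcome: both entries carry the same license, and only the recorded usage and approval status distinguish them.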

Ultimately, protexIP is an enterprise-class solution that enables developers, lawyers, and management to share information about software-as-an-asset. It’s not a scanning tool -- it’s a lot more. Many of the stories about FOSSology didn't get down to this level of detail about code analysis, but corporate customers will. We look forward to it.
