Open Source Software (OSS) projects often rely on large repositories, such as SourceForge, for initial incubation. These repositories offer a wide variety of meta-data that carries useful information about projects and their success. In this paper we propose a data mining approach for building classifiers on the OSS meta-data provided by such repositories. The classifiers learn to predict the successful continuation of an OSS project. The ‘successfulness’ of a project is defined in terms of the confidence with which a classifier predicts that the project could be ported to popular OSS projects (such as FreeBSD and Gentoo Portage). The classifiers can thus assist in predicting the future of any submitted OSS project (i.e., whether the project will be ported by other popular OSS projects). We argue that this new aspect of measuring the successfulness of OSS projects can serve as an additional metric in previously proposed models of OSS successfulness. We have experimentally evaluated the proposed approach on the SourceForge and FreshMeat project data collected by the FLOSS project. The reported results are promising and demonstrate the significance of the information that OSS repository meta-data can provide.