Title GHTorrent: GitHub's data from a firehose
URL http://dl.acm.org/citation.cfm?id=2664449
Year 2012
Citations 65
Versions 15
Cluster ID 9159843476657694384
Citations list http://scholar.google.com/scholar?cites=9159843476657694384&as_sdt=2005&sciodt=0,5&hl=en
Versions list http://scholar.google.com/scholar?cluster=9159843476657694384&hl=en&as_sdt=0,5
Excerpt Abstract A common requirement of many empirical software engineering studies is the acquisition and curation of data from software repositories. During the last few years, GitHub has emerged as a popular project hosting, mirroring and collaboration platform. GitHub provides an extensive REST API, which enables researchers to retrieve both the commits to the projects' repositories and events generated through user actions on project resources. ...