Go to file
2015-09-08 21:57:58 -05:00
services initial commit 2015-09-08 21:56:08 -05:00
enum.py initial commit 2015-09-08 21:56:08 -05:00
Issue.py initial commit 2015-09-08 21:56:08 -05:00
main.py initial commit 2015-09-08 21:56:08 -05:00
Project.py initial commit 2015-09-08 21:56:08 -05:00
README.txt updating readme 2015-09-08 21:57:58 -05:00
Release.py initial commit 2015-09-08 21:56:08 -05:00
Service.py initial commit 2015-09-08 21:56:08 -05:00
Wiki.py initial commit 2015-09-08 21:56:08 -05:00

# codescrape

Version 1.0

By: Nathan Adams

License: MIT

## Description

This library is to be used to archive project data. Since with the announcement of Google Code going to archive only - I wanted to create a library where you can grab source data before it is gone forever.

Use cases include:

Archive projects due to:

- Hosting service shutting down
- Authorities sending cease-and-desist against provider/project
- Historical/research/ or educational purposes

## Usage

Currently srchub and google code are supported. To use:

    from services.srchub import srchub
    shub = srchub()
    projects = shub.getProjects()
	
or for google code

    from services.googlecode import googlecode
    gcode = googlecode()
    project = gcode.getProject("android-python27")
	
Sourcehub library will pull all public projects since this list is easily accessed. Google Code does not have a public list persay. And I didn't want to scrape the search results, so I developed it to require you to pass in the project name. If you were to get your hands on a list of google code projects you could easily loop through them:

    from services.googlecode import googlecode
    gcode = googlecode()
    for project in someProjectList:
	    project = gcode.getProject(project)
        # do something with project
		
the project data structure is as follows:

project

- getRepoURL() -> Returns the URL of the repo
- getRepoType() -> Returns the type of repo (git, hg, or SVN)
- getReleases() -> Returns all downloads related to the project
- getIssues() -> Returns open issues
- getWikis() -> Returns wikis