odeke-em/crawlers

Latest commit: a8722a1 by Emmanuel Odeke, Oct 11, 2015 (83 commits)

Crawlers for various websites, mostly news providers

The basic idea: fetch the content of a web page and examine the text present, extracting matching keywords or text, e.g. by file extension or domain.
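The extraction step can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function name `extract_links`, the regular expression, and the `extensions`/`domains` parameters are all assumptions made for the example.

```python
import re
from urllib.parse import urljoin, urlparse

# Hypothetical pattern: captures the value of href/src attributes.
# A real crawler may prefer a proper HTML parser.
HREF_RE = re.compile(r'''(?:href|src)\s*=\s*["']([^"'#]+)["']''', re.IGNORECASE)


def extract_links(html, base_url, extensions=(), domains=()):
    """Return absolute links whose path ends with one of `extensions`
    or whose host matches one of `domains` (hypothetical helper)."""
    exts = tuple(e.lower() for e in extensions)
    doms = tuple(d.lower() for d in domains)
    matches = []
    for raw in HREF_RE.findall(html):
        url = urljoin(base_url, raw.strip())  # resolve relative links
        parsed = urlparse(url)
        path = parsed.path.lower()
        host = (parsed.hostname or "").lower()
        by_ext = bool(exts) and path.endswith(exts)
        by_domain = any(host == d or host.endswith("." + d) for d in doms)
        if (by_ext or by_domain) and url not in matches:
            matches.append(url)  # keep first-seen order, skip duplicates
    return matches
```

Calling `extract_links(page_text, page_url, extensions=(".pdf", ".jpg"))` would keep only links to PDF and JPEG files, while `domains=("example.com",)` would keep links hosted on that domain or its subdomains.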

Once links are extracted, any that point to files are either downloaded directly or queued up in the cloud for workers to perform the actual downloads.
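The download-or-queue choice might look something like the sketch below. It uses the standard library's in-process `queue.Queue` and threads purely as a local stand-in for the cloud queue and its workers; the names `dispatch`, `worker`, and `run_workers` are hypothetical, and `download` is any callable that fetches one URL.

```python
import queue
import threading


def dispatch(links, download, job_queue=None):
    """Download each link immediately, or enqueue it if a queue is given."""
    for url in links:
        if job_queue is None:
            download(url)
        else:
            job_queue.put(url)


def worker(job_queue, download):
    """Process queued URLs until a None sentinel arrives."""
    while True:
        url = job_queue.get()
        if url is None:
            job_queue.task_done()
            break
        try:
            download(url)
        finally:
            job_queue.task_done()


def run_workers(job_queue, download, count=2):
    """Start `count` workers, wait for the queue to drain, then stop them."""
    threads = [threading.Thread(target=worker, args=(job_queue, download))
               for _ in range(count)]
    for t in threads:
        t.start()
    job_queue.join()          # block until every queued URL is processed
    for _ in threads:
        job_queue.put(None)   # one sentinel per worker
    for t in threads:
        t.join()
```

Passing no queue mirrors the local downloader, while passing a queue mirrors the job queuer, with the workers consuming jobs separately.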

  • To use the local downloader:

    ++ Works on any version of Python >= 2.X

    python fileDownloader.py

  • To use the cloud-based job queuer:

    ++ So far built for Python 3.X

    python3 targetForCloud.py
