Download: Latest Version from SkyDrive
iScrape is a simple program that reads web pages and grabs the text from them. It then saves the text as plain text files and opens them in Notepad. Simple.
This simplifies the process of grabbing a large number of ebooks from sites like FanFiction.
It ignores most other data, such as ads and images. For this reason the scraper will not work on sites that rely heavily on scripts or non-text content (images, forms, etc.).
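To give a rough idea of what "grab the text and ignore the rest" means, here is a minimal sketch in Python. It is not iScrape's code (iScrape is C# and uses the HTMLAgilityPack); the class and function names are made up for illustration, and it assumes that skipping script and style blocks is enough to leave only readable text.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects visible text, skipping <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def extract_text(html):
    """Return the text of an HTML string, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

Images and forms carry no text of their own, so they simply drop out. A page that builds its content with scripts would come back nearly empty, which is why such sites do not work.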
iScrape accepts input either as arguments passed on the command line or typed in after the program has started. If that makes no sense to you, just ignore it.
When you launch iScrape it will prompt you to enter an address. This may be either the URL of a web page that you wish to grab the text from or the location of a plain text (.txt) file that contains a list (separated by spaces) of the URLs you want to read. If you don't understand the second option, a sample "test.txt" file has been included that downloads 4 chapters of a sample book from FanFiction.
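The two kinds of address can be told apart roughly like this. Again this is only a Python sketch of the idea, not how iScrape does it internally; the function name is hypothetical, and it assumes a list file ends in ".txt" and exists on disk.

```python
import os


def resolve_addresses(address):
    """Hypothetical helper: turn one entered address into a list of URLs."""
    # Assumption: an existing file ending in .txt is treated as a URL list.
    if address.lower().endswith(".txt") and os.path.isfile(address):
        with open(address, encoding="utf-8") as handle:
            # split() with no arguments breaks on spaces and on line breaks.
            return handle.read().split()
    # Anything else is treated as a single URL.
    return [address]
```

So a list file is nothing more than URLs with spaces between them, all in one plain text file.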
If you enter an invalid URL or .txt file the program will break and exit. If that happens, all you need to do is fix whatever was wrong in the URL or .txt file and restart.
If you run into problems, just let me know. I plan on expanding this far beyond its current scope.
This program takes advantage of the HTMLAgilityPack, which is not developed by me.
Theoretically this should make getting all of those text files for your Zune eReaders a lot easier.
NOTE: Files are saved as "[TITLE].txt" where [TITLE] is the title of the web page you downloaded them from. For those of you that don't know, the title of a page is what appears in your browser tab. Spaces in the title are kept in the file name.
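As an illustration of the naming rule, here is a small Python sketch. It is not taken from iScrape: the function name is hypothetical, and stripping characters that Windows does not allow in file names is my assumption, since a page title can contain them.

```python
import re


def filename_from_title(title):
    """Hypothetical helper: build "[TITLE].txt" from a page title."""
    # Assumption: drop characters Windows forbids in file names (\ / : * ? " < > |).
    cleaned = re.sub(r'[\\/:*?"<>|]', "", title).strip()
    # Spaces are left alone; fall back to a placeholder if nothing remains.
    return (cleaned or "untitled") + ".txt"
```

For example, a page titled "My Story Chapter 1" would end up as "My Story Chapter 1.txt" under these assumptions.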
NOTE 2: This is written in C# and thus should only work on Windows machines.
I have tested this on FanFiction. Further tests will be done shortly but I figured I'd get this out there. This may have errors when reading from other sites.
An example of the Wikipedia article on "Dice"
If you run into problems or have questions feel free to ask me.