If I want to scrape some data, I go to Outwit and open the page from within Outwit itself. I like to think of Outwit as a geeky internet browser which lets you see what's behind the page as well as what you can normally see. For this example I'm going to search for all instances of WILLIAM TILLIN in the 1841 England census, which keeps it small - just 3 records.
Normally, if I wanted to collect the information for each of these records, I would go into each individual record as below and type the information from the screen into my database - or at best copy and paste it using something like a Firefox add-in.
But Outwit works slightly differently.
This is the page of data that I want to "scrape", and this is the page that my scraper will go to and collect the data from. I've written the scraper myself based on the HTML behind this page, but I won't go into too much detail here about how I did that as it would make this a really long post. I can do more of an explanation in another post if that would help anybody.
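For anyone who would rather see the idea in code, here is a rough Python sketch of what "a scraper based on the HTML behind the page" can mean. It is not how Outwit works internally and it is not my actual 1841 scraper. The markup in SAMPLE_RECORD_HTML is made up for illustration (a real census site will be laid out differently), the parish value is a placeholder rather than real census data, and the names RecordParser and scrape_record are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical record-page markup. The layout and the parish value are
# invented for this sketch; only the searched-for name comes from the post.
SAMPLE_RECORD_HTML = """
<table>
  <tr><th>Name:</th><td>WILLIAM TILLIN</td></tr>
  <tr><th>Parish:</th><td>PLACEHOLDER PARISH</td></tr>
</table>
"""


class RecordParser(HTMLParser):
    """Collect <th>label</th><td>value</td> pairs into a dict."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current_tag = None  # "th" or "td" while we are inside one
        self._label = None        # the most recent label we have seen
        self._buffer = []         # text pieces of the current cell

    def handle_starttag(self, tag, attrs):
        if tag in ("th", "td"):
            self._current_tag = tag
            self._buffer = []

    def handle_data(self, data):
        # Only keep text that sits inside a label or value cell.
        if self._current_tag:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self._current_tag:
            return
        text = "".join(self._buffer).strip()
        if tag == "th":
            self._label = text.rstrip(":")
        elif tag == "td" and self._label:
            self.fields[self._label] = text
            self._label = None
        self._current_tag = None


def scrape_record(html):
    """Return the label/value pairs found in one record page."""
    parser = RecordParser()
    parser.feed(html)
    return parser.fields


if __name__ == "__main__":
    print(scrape_record(SAMPLE_RECORD_HTML))
```

The point is simply that the scraper looks for a marker in the HTML (here a label cell) and takes whatever follows it as the value, which is the same sort of rule you describe to a scraping tool.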
So, using the back button in the top left-hand corner (just like in other web browsers), I get back to my original search results. Then I click on the links button (circled in red below), and this gives me a completely different view of the page.
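The links view is essentially a list of every link on the page. As a rough illustration only, here is how that list could be pulled out of a page in Python. The sample HTML, the census.example address and the names LinkCollector and list_links are all assumptions for this sketch, not anything taken from Outwit or a real census site.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical search-results markup, invented for this sketch.
SAMPLE_RESULTS_HTML = """
<a href="/record/1">View record</a>
<a href="/record/2">View record</a>
<a href="/record/3">View record</a>
<a href="help.html">Help</a>
"""


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag as an absolute URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Turn relative links into full addresses.
                self.links.append(urljoin(self.base_url, href))


def list_links(html, base_url):
    """Return all links found in the page, in page order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


if __name__ == "__main__":
    all_links = list_links(SAMPLE_RESULTS_HTML, "https://census.example/search")
    # Keep only the links that lead to individual records.
    record_links = [link for link in all_links if "/record/" in link]
    print(record_links)
```

Picking out just the record links at the end is the code equivalent of highlighting the rows you want in the links view.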
So I highlight the rows and tell it to go and explore using my 1841 scraper, making sure I don't overload the site by exploring too many records too quickly. This is the data it brings back in about 10-15 seconds.
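Going easy on the site is the important bit, so here is a minimal sketch of that idea in Python: visit each link in turn, wait between requests, and run the scraper on each page. The function names explore and fetch_page, and the 3 second default delay, are my own assumptions for illustration. Outwit has its own settings for this, and the right delay depends on the site and its terms of use.

```python
import time
from urllib.request import urlopen


def fetch_page(url):
    """Download one page and return its HTML as text."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def explore(urls, scrape, fetch=fetch_page, delay_seconds=3.0, sleep=time.sleep):
    """Fetch each URL in turn, pausing between requests, and scrape each page.

    scrape is any function that takes HTML text and returns a dict of fields.
    fetch and sleep can be swapped out, which makes the loop easy to try
    without touching a real website.
    """
    rows = []
    for index, url in enumerate(urls):
        if index > 0:
            # Pause before every request except the first one.
            sleep(delay_seconds)
        rows.append(scrape(fetch(url)))
    return rows


if __name__ == "__main__":
    # Dry run with stand-in functions, so nothing is actually downloaded.
    fake_pages = {"a": "<p>page a</p>", "b": "<p>page b</p>"}
    result = explore(
        ["a", "b"],
        scrape=lambda html: {"length": len(html)},
        fetch=fake_pages.get,
        delay_seconds=0,
    )
    print(result)
```

With 3 records and a pause between each, a run of around 10-15 seconds is about what you would expect.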
You can see the detailed information at the bottom, based on the 3 records from the original search. The data can then be exported as a CSV or Excel file and added to the database.
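If you were doing the export step yourself rather than using Outwit's export, it might look something like this in Python. This is only a sketch: rows_to_csv is a hypothetical helper, and the rows in the example are placeholders, not real census entries.

```python
import csv
import io


def rows_to_csv(rows):
    """Turn a list of dicts into CSV text, one line per record.

    Column order follows the order in which field names first appear.
    Records that are missing a field get an empty cell.
    """
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)

    output = io.StringIO(newline="")
    writer = csv.DictWriter(output, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return output.getvalue()


if __name__ == "__main__":
    # Placeholder rows for illustration only.
    example_rows = [
        {"Name": "WILLIAM TILLIN", "Parish": "PLACEHOLDER PARISH"},
        {"Name": "WILLIAM TILLIN"},
    ]
    print(rows_to_csv(example_rows))
```

The resulting text can be saved with a .csv extension and opened in Excel or imported into a database.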
I hope this makes sense - I've found it exceptionally useful for census data as well as civil registration data. I would not have all the data I have if I'd had to collate the information in the traditional way.
I'd be happy to provide some more examples or go into more detail - so please leave any questions in the comments below or ask me on Twitter - you can find me @Wibblingjo - or on my Google+ page +Wibbling Jo Genealogy.