|
|
(One intermediate revision by the same user not shown) |
Line 1: |
Line 1: |
| '''ScraperSync''' is a tool for maintaining an FOIwiki page as a mirror of a dataset from [http://www.scraperwiki.com/ ScraperWiki]. It attempts to preserve markup and changes in the page while propagating changes from the dataset. | | '''ScraperSync''' no longer works since ScraperWiki ceased being useful. |
| | |
| ==Dataset requirements==
| |
| | |
| The dataset must have a table called "swdata" containing a column called "name". The contents of this column form the texts of entries in a bulleted list in the wiki page.
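The dataset requirement above can be illustrated with a minimal sketch. It assumes the dataset is reachable as an SQLite database, which this page does not state. The function name <tt>bullets_from_dataset</tt> and the sample rows are hypothetical, and this is not ScraperSync's actual code.

```python
import sqlite3


def bullets_from_dataset(conn):
    """Return one wiki bullet line per row of the "name" column in "swdata"."""
    rows = conn.execute("SELECT name FROM swdata ORDER BY rowid").fetchall()
    return ["* " + name for (name,) in rows]


# Hypothetical in-memory dataset standing in for a real scraper's datastore.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE swdata (name TEXT)")
conn.executemany(
    "INSERT INTO swdata (name) VALUES (?)",
    [("Example Council",), ("Example Agency",)],
)
bullet_lines = bullets_from_dataset(conn)
```

A dataset without the "swdata" table or the "name" column would make the query raise <tt>sqlite3.OperationalError</tt>, which is one simple way to detect that the requirement is not met.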
| |
| | |
| ==List requirements==
| |
| | |
| The start of the maintained list is marked by a special comment like this:
| |
| | |
| <pre><nowiki><!-- ScraperSync start { "scraper": "thing" } --></nowiki></pre>
| |
| | |
| After the word "start" is a JSON object which configures ScraperSync. It can contain the following entries:
| |
| | |
| * <tt>scraper</tt>: names the scraper from which the data should be pulled.
| |
| * <tt>sort</tt>: if set to <tt>name</tt>, sorts list entries by name.
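As an illustration of the start comment and its JSON configuration, the following sketch shows one way such a marker could be located and decoded. The regular expression and the function name <tt>parse_start_marker</tt> are assumptions made for this example; how ScraperSync itself parsed the comment is not documented here.

```python
import json
import re

# Assumed pattern: the word "start", then a JSON object, then the comment end.
START_RE = re.compile(r"<!--\s*ScraperSync start\s*(\{.*?\})\s*-->", re.DOTALL)


def parse_start_marker(page_source):
    """Return the configuration dict from the start comment, or None if absent."""
    match = START_RE.search(page_source)
    if match is None:
        return None
    return json.loads(match.group(1))


config = parse_start_marker(
    '<!-- ScraperSync start { "scraper": "thing", "sort": "name" } -->'
)
```

Here <tt>config["scraper"]</tt> names the scraper and <tt>config.get("sort")</tt> is either <tt>"name"</tt> or <tt>None</tt>.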
| |
| | |
| ScraperSync understands several different list formats, and tries to remove markup and notes to work out what public authority is named by the item.
| |
| | |
| The maintained list is ended by another special comment:
| |
| | |
| <pre><nowiki><!-- ScraperSync end --></nowiki></pre>
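The two special comments delimit the maintained list. A minimal sketch of extracting the lines between them follows. The patterns and the function name <tt>maintained_lines</tt> are hypothetical, and the sketch makes no attempt to interpret the different list formats mentioned above.

```python
import re

# Assumed patterns for the two marker comments described on this page.
START = re.compile(r"<!--\s*ScraperSync start\b.*?-->", re.DOTALL)
END = re.compile(r"<!--\s*ScraperSync end\s*-->")


def maintained_lines(page_source):
    """Return the non-blank lines between the start and end comments."""
    start = START.search(page_source)
    if start is None:
        return []
    end = END.search(page_source, start.end())
    if end is None:
        return []
    body = page_source[start.end():end.start()]
    return [line for line in body.splitlines() if line.strip()]


page = (
    "Intro\n"
    '<!-- ScraperSync start { "scraper": "thing" } -->\n'
    "* A\n"
    "* B\n"
    "<!-- ScraperSync end -->\n"
    "Outro\n"
)
entries = maintained_lines(page)
```

Text outside the two comments is left untouched, which is what lets the rest of the page be edited freely.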
| |
| | |
| ==Output==
| |
| | |
| ScraperSync's main output is a new version of the page source. It attempts to:
| |
| * cross out bodies that no longer appear in the scraper
| |
| * add bodies to the list that appear in the scraper but not in the page
| |
| * mark as "gone" any bodies with the {{tag|defunct}} tag in WhatDoTheyKnow
| |
| * link bodies to WhatDoTheyKnow if they aren't already linked
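The first two behaviours in the list above amount to a set comparison between the page and the scraper. The sketch below is only an illustration under stated assumptions: it uses <tt>&lt;s&gt;</tt> tags for crossing out, which may not be the markup ScraperSync produced, and the function name <tt>sync_entries</tt> is hypothetical. It does not cover the defunct-tag check or the WhatDoTheyKnow linking.

```python
def sync_entries(page_names, scraper_names):
    """Merge page entries with scraper entries, keeping the page's order.

    Names missing from the scraper are crossed out; names found only in the
    scraper are appended at the end.
    """
    in_scraper = set(scraper_names)
    in_page = set(page_names)
    output = []
    for name in page_names:
        if name in in_scraper:
            output.append("* " + name)
        else:
            # Assumed cross-out markup; the real output format is not documented.
            output.append("* <s>" + name + "</s>")
    for name in scraper_names:
        if name not in in_page:
            output.append("* " + name)
    return output


merged = sync_entries(["A", "B"], ["B", "C"])
```

Keeping the page's existing order and appending new entries last is a design choice of this sketch; with <tt>sort</tt> set to <tt>name</tt>, the entries would be sorted instead.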
| |
| | |
| ==Creating a new page==
| |
| | |
| ScraperSync has a mode for creating an entirely new page from a dataset. The easiest way to activate it is by [http://scraperwikiviews.com/run/foiwiki_scrapersync/ running it with no options], choosing "Create new", and entering the URL name of the scraper to use as the source.
| |
ScraperSync no longer works since ScraperWiki ceased being useful.