For anyone, @Arthur_Wolf or otherwise, who is interested in helping, here are some resources for scraping JavaScript-driven websites like Wikidot:
- GitHub - tahama/scrapbookq: ScrapbookQ is a Firefox extension, which helps you to save Web pages and easily manage collections. Compatible with old ScrapBook extension, Suppert manage captures at browser sidebar. — firefox extension, might be enough to get data?
- How to Scrape Web using Python, Selenium and Beautiful Soup · Swetha's Blog — step by step, basically gives you a Python environment driving the browser to walk the site, so you just have to write a python script which @Arthur_Wolf knows how to do…
- Introduction to Web Scraping using Selenium | by Roger Taracha | The Andela Way | Medium — more of the same
- GitHub - internetarchive/brozzler: brozzler - distributed browser-based web crawler — The Internet Archive uses this
- GitHub - ikreymer/browsertrix: (Note: This repository is obsolete, please see the new Browsertrix webrecorder/browsertrix) Browser-Based On-Demand Web Archiving Automation — instance scraper, just add docker?
- Archiving web sites [LWN.net] — Linux Weekly News, as usual, points to quite a few resources, including some of these.
The Google+ importer I wrote is built around a core assumption of Google user accounts, which Wikidot doesn’t provide, so it would be a false economy to try to re-use it. It will actually be easier for me to start with something simple than to work around the data not being in the same format.
Something like this would be easy to import:
{
'users': {
'arthurwolf': 'Arthur_Wolf',
},
'posts': [
{
'id': 'post-content-2205295',
'url': 'http://forum.smoothieware.org/forum/t-1081758/external-drivers',
'title': 'External drivers',
'date': '2015-01-12T21:25Z',
'author': 'harry11733',
'text': '<p>I plan to use some 570 oz stepper motors on a vertical mill CNC conversion. I am trying to decide what stepper motor drivers to use. It seems that the DQ860MA Stepper Motor Driver is the only one that has been reported as working with the smoothieboard.</p>
<p>I am interested in trying the digital steppers from Automation Technology, specifically the KL-5056D driver, which people have been happy with for this purpose. Do these drivers have any advantages or disadvantages relative to the DQ860MA? Will they even work with the smoothieboard? I don't really understand how these newer digital drivers work, for all I know it may be the same basic technology as the DQ860MA.</p>
<p>I am happy to pay a little more for the Automation Technology products in the anticipation of better customer support.</p>
<p>The current technology that people use for mill CNC conversions seems a bit nutty to me. They use the ethernet smooth stepper board to convert an ethernet signal to a parallel port in order to transmit the g-code from Mach 3/4 to the drivers. This seems circuitous compared to using a smoothieboard just to translate a g-code file, and much more expensive, but maybe I am missing some advantage that this other system offers.</p>',
'comments': [
{
'id': 'post-content-2205401',
'date': '2015-01-13T00:35Z',
'author': 'bouni',
'text': '<p>Hi,</p>
<p>I\'ve tried the DQ860MA external steppers with the smoothieboard and they work without any problems so far.</p>
<p><strong>vimeo.com / 115509540</strong></p>
<p>In my opinion the KL-5056D should work as well.<br>
In the <strong>kelinginc.net / KL-5056D.pdf</strong> , page 4 figure 3 you can see how you have to wire the drivers to the ST,DIR,EN and GND pins of the smothieboard. The internal resistors are 270Ohm and calculated for 5VDC signals, as far as i know the smoothieboard outputs only 3.3V, but for my DQ860MA that was not a problem, the optocouplers get only 8mA in this case but it seems to be enough to switch them.</p>
<p>Bouni</p>'
},
{
'id': 'post-content-2205875',
'date': '2015-01-13T10:21Z',
'author': 'arthurwolf',
'text': '<blockquote>
<p>I plan to use some 570 oz stepper motors on a vertical mill CNC conversion. I am trying to decide what stepper motor drivers to use. It seems that the DQ860MA Stepper Motor Driver is the only one that has been reported as working with the smoothieboard.</p>
</blockquote>
<p>Pretty much any external driver with a step/direction interface will work with Smoothieboard.</p>
<p>In some cases the driver will want 5V input, and Smoothieboard outputs 3.3V, but generally the drivers are fine with 3.3V even if rated at 5V. If 3.3V is not sufficent it\'s trivial to use a level shifter to bump the 3.3V up to 5V.</p>
<p>So generally, the vast majority of external drivers work out of the box with Smoothieboard.</p>
<p>I personally use the CW5045 and am pretty happy with it.</p>
<blockquote>
<p>I am interested in trying the digital steppers from Automation Technology, specifically the KL-5056D driver, which people have been happy with for this purpose. Do these drivers have any advantages or disadvantages relative to the DQ860MA? Will they even work with the smoothieboard?</p>
</blockquote>
<p>Yes.<br>
All of those drivers are very similar, and work with Smoothie.</p>
<p>Pretty much, if you read DIR+ DIR- EN+ EN- PUL+ PUL- on it, you know it\'ll work with Smoothieboard.</p>
<blockquote>
<p>The current technology that people use for mill CNC conversions seems a bit nutty to me. They use the ethernet smooth stepper board to convert an ethernet signal to a parallel port in order to transmit the g-code from Mach 3/4 to the drivers.</p>
</blockquote>
<p>Yeah that\'s just a relic of the 80s :)</p>
<p>Smoothie is the modern way to do it :)</p>'
}
]
}
]
}
In the users section, mapping a Wikidot username to a string means attaching that author’s posts to an existing makerforums user. You can fill those in where you know the mapping; otherwise I’ll just create a new user whenever I need to. Those new users won’t give people magic edit rights the way imported G+ posts do, but it’s a support tool for @Arthur_Wolf’s forum, and it’s the best I can do. I think that referencing the original URL and author, inasmuch as we have that information, would comply with the license terms posted there; at least, that’s my intent.
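To make the user mapping concrete, here is a rough Python sketch of how authors could be sorted into "attach to an existing user" versus "create a new user". It is not the actual importer code; `resolve_author`, `iter_entries` and `plan_users` are hypothetical names, and it assumes the export has already been loaded as a dict with `users` as a mapping.

```python
def resolve_author(author, users):
    """Return (username, exists) for a source author.

    `users` maps source usernames to existing forum usernames.
    Unmapped authors keep their source name and are flagged for creation.
    """
    mapped = users.get(author)
    if mapped is not None:
        return mapped, True
    return author, False


def iter_entries(export):
    """Yield each post dict, followed by that post's comment dicts."""
    for post in export.get("posts", []):
        yield post
        for comment in post.get("comments", []):
            yield comment


def plan_users(export):
    """Split authors into those attached to existing users and those to create."""
    users = export.get("users", {})
    attach, create = {}, set()
    for entry in iter_entries(export):
        name, exists = resolve_author(entry["author"], users)
        if exists:
            attach[entry["author"]] = name
        else:
            create.add(name)
    return attach, sorted(create)


# Trimmed-down version of the example above (text fields omitted).
export = {
    "users": {"arthurwolf": "Arthur_Wolf"},
    "posts": [
        {
            "id": "post-content-2205295",
            "author": "harry11733",
            "comments": [
                {"id": "post-content-2205401", "author": "bouni"},
                {"id": "post-content-2205875", "author": "arthurwolf"},
            ],
        }
    ],
}

attach, create = plan_users(export)
print(attach)  # {'arthurwolf': 'Arthur_Wolf'}
print(create)  # ['bouni', 'harry11733']
```

Anything that ends up in `create` would get a fresh user; anything in `attach` lands on the named existing account.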
The URL in the example is the source page I used to create it. The ID comes from the id attribute they put on each post’s div, and Discourse imports really like unique IDs from the source system.
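As a starting point for whoever writes the scraper, here is a minimal sketch of pulling those IDs and the post bodies out of already-rendered HTML using only the Python standard library. It assumes each post body sits in a `<div>` whose `id` starts with `post-content-`, as in the example; the rest of the page structure, the sample HTML, and the names `PostContentParser` and `extract_posts` are my own assumptions. It raises an error on a repeated ID, since the import wants them unique.

```python
from html.parser import HTMLParser


class PostContentParser(HTMLParser):
    """Collect the inner HTML of every <div id="post-content-...">."""

    def __init__(self):
        # Keep entities such as &amp; untouched so the HTML round-trips.
        super().__init__(convert_charrefs=False)
        self.posts = {}       # div id -> inner HTML
        self._current = None  # id of the div being captured, if any
        self._depth = 0       # nesting depth of <div> inside the capture
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if self._current is None:
            div_id = dict(attrs).get("id") or ""
            if tag == "div" and div_id.startswith("post-content-"):
                if div_id in self.posts:
                    raise ValueError(f"duplicate post id: {div_id}")
                self._current, self._depth, self._chunks = div_id, 1, []
            return
        if tag == "div":
            self._depth += 1
        self._chunks.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags like <br/> are copied through verbatim.
        if self._current is not None:
            self._chunks.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self._current is None:
            return
        if tag == "div":
            self._depth -= 1
            if self._depth == 0:
                self.posts[self._current] = "".join(self._chunks).strip()
                self._current = None
                return
        self._chunks.append(f"</{tag}>")

    def handle_data(self, data):
        if self._current is not None:
            self._chunks.append(data)

    def handle_entityref(self, name):
        if self._current is not None:
            self._chunks.append(f"&{name};")

    def handle_charref(self, name):
        if self._current is not None:
            self._chunks.append(f"&#{name};")


def extract_posts(html):
    """Return {div id: inner HTML} for all post-content divs in `html`."""
    parser = PostContentParser()
    parser.feed(html)
    parser.close()
    return parser.posts


# Made-up sample markup, only to exercise the parser.
sample = (
    '<div class="post"><div id="post-content-1">'
    "<p>Hi &amp; bye<br>ok</p></div></div>"
    '<div id="post-content-2"><div><p>x</p></div></div>'
)
posts = extract_posts(sample)
print(sorted(posts))  # ['post-content-1', 'post-content-2']
```

The HTML string would come from whatever drives the browser (Selenium or one of the other tools listed above); this sketch only covers the parsing step.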