Expand on the example so that users can pass a keyword as an argument, and save the resulting HTML file to the local machine.
If you would like to code it yourself, follow the steps in the Development log.
$ npm install
$ node index.js "code & coffee vancouver"
If you followed the steps in ecs_s3_scraper_starter and used yarn instead of npm, run
$ yarn add command-line-args
Otherwise, run
$ npm install --save command-line-args
Command-line-args npm package
const commandLineArgs = require('command-line-args')
const optionDefinitions = [
  { name: 'keyword', alias: 'k', type: String, defaultOption: true },
]
const options = commandLineArgs(optionDefinitions)
Note: setting defaultOption to true means that we don't have to pass the -k flag. Any argument that is not attached to another option is treated as the "keyword".
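If the script is run without any argument, options.keyword is likely to be undefined, and that value would then flow into both the search box and the file name. Below is a minimal sketch of a guard, assuming that behaviour. The helper name requireKeyword and the usage message are hypothetical and not part of command-line-args or the original index.js.

```javascript
// Hypothetical helper: not part of command-line-args or the original index.js.
// Returns the trimmed keyword, or throws if none was supplied.
function requireKeyword(options) {
  const keyword =
    typeof options.keyword === 'string' ? options.keyword.trim() : '';
  if (keyword === '') {
    throw new Error('Usage: node index.js "<keyword>"');
  }
  return keyword;
}

// Stand-in for the object returned by commandLineArgs(optionDefinitions).
const exampleOptions = { keyword: '  code & coffee vancouver ' };
console.log(requireKeyword(exampleOptions));
```

In index.js, one option would be to call requireKeyword(options) right after commandLineArgs(optionDefinitions) and use the returned value in place of options.keyword.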
Change
.type('#search_form_input_homepage', "code and coffee vancouver")
to
.type('#search_form_input_homepage', options.keyword)
$ node index.js "code & coffee vancouver"
$ node index.js -k "code & coffee vancouver"
const fs = require('fs');
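  .then(function () {
    // Grab the rendered markup of the results page.
    return nightmare.evaluate(function () {
      return document.querySelector('body').innerHTML;
    });
  })
  .then(function (page) {
    // Save the markup as "<keyword>.html" in the current directory.
    fs.writeFileSync(options.keyword + '.html', page);
  })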
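The keyword is used directly as the file name, so a search such as "code & coffee vancouver" produces a file name containing spaces and an ampersand. That works on most systems but is awkward to handle in a shell. The sketch below shows one possible way to derive a more conservative name. The helper keywordToFilename and the fallback name "results" are hypothetical and not part of the original example.

```javascript
// Hypothetical helper: turns a search keyword into a conservative file name.
function keywordToFilename(keyword) {
  const slug = keyword
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // collapse each run of other characters into one dash
    .replace(/^-+|-+$/g, ''); // trim leading and trailing dashes
  // Fall back to a fixed name if nothing usable is left.
  return (slug || 'results') + '.html';
}

console.log(keywordToFilename('code & coffee vancouver'));
// prints "code-coffee-vancouver.html"
```

With this in place, the write step could use keywordToFilename(options.keyword) instead of options.keyword + '.html'. Note that the pattern only keeps ASCII letters and digits, so it would need adjusting for keywords in other scripts.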
Replace
.end()
with
  .then(function () {
    return nightmare.end();
  })
$ node index.js "code & coffee vancouver"
Nightmare.js lets us open the Chrome DevTools, which makes it much easier to debug the issue. This is especially handy when the web page has changed.
Change the Nightmare initialization in index.js to the following.
var nightmare = Nightmare({
  show: true,
  openDevTools: {
    mode: 'detach'
  },
  dock: true,
});
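Since the visible window and DevTools are only useful while debugging, it may be convenient to switch them on and off without editing index.js each time. The following is a rough sketch of one way to do that. The helper buildNightmareOptions and the environment variable SCRAPER_DEBUG are hypothetical names. The option names themselves (show, openDevTools, dock) are the ones used above.

```javascript
// Hypothetical helper: builds the options object passed to Nightmare(...).
function buildNightmareOptions(debug) {
  if (!debug) {
    // Normal runs: keep the browser window hidden.
    return { show: false };
  }
  // Debug runs: show the window and open DevTools in a detached window.
  return {
    show: true,
    openDevTools: { mode: 'detach' },
    dock: true,
  };
}

// SCRAPER_DEBUG is an assumed variable name, chosen for this sketch only.
const nightmareOptions = buildNightmareOptions(process.env.SCRAPER_DEBUG === '1');
console.log(nightmareOptions);
// In index.js this could be used as: var nightmare = Nightmare(nightmareOptions);
```

A debug run would then look like: SCRAPER_DEBUG=1 node index.js "code & coffee vancouver"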
The problem is that our code waits for a div with ID zero_click_wrapper that contains a div with class c-info__title.
The page no longer has that structure, so waiting for it doesn't work. However, there is a div with class content-wrap, so let us wait for that instead.
Change the .wait line in index.js to the following.
.wait('.content-wrap')
$ node index.js "code & coffee vancouver"