Starter kit for running headless Chrome with Puppeteer on AWS Lambda.
$ git clone -o starter-kit https://github.com/sambaiz/puppeteer-lambda-starter-kit.git your_project_name
Run SLOWMO_MS=250 npm run local to check the behavior locally while watching Chrome in a visible window (non-headless, slow motion).
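As a rough illustration of what such a local run might configure, the sketch below maps a SLOWMO_MS value to Puppeteer launch options. `headless` and `slowMo` are existing `puppeteer.launch` options. The helper name `localLaunchOptions` and the way the variable is read are assumptions, not taken from the kit's source.

```javascript
// Hypothetical helper (not from the kit): builds launch options for a local run.
function localLaunchOptions(env) {
  // SLOWMO_MS is read as an integer number of milliseconds; missing or invalid values fall back to 0.
  const parsed = Number.parseInt(env.SLOWMO_MS || '0', 10);
  return {
    headless: false,                         // show the browser window
    slowMo: Number.isNaN(parsed) ? 0 : parsed, // delay each Puppeteer operation by this many ms
  };
}

const options = localLaunchOptions({ SLOWMO_MS: '250' });
console.log(options);
```

The resulting object could then be passed to `puppeteer.launch(options)`.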
Lambda's memory must be set to at least 384 MB, and allocating more memory makes every operation faster. For example:
512MB -> goto(youtube): 6.481s
1536MB(Max) -> goto(youtube): 2.154s
Run npm run package, and deploy the package.zip.
Because Chrome is large, the package may exceed the Lambda package size limit (50MB), depending on the other modules you include. In that case, put Chrome in S3 and download it at container startup. Note that this makes startup slower.
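The decision described above can be sketched as a small check on the size of the built zip. This is only an illustration: the helper name `choosePackaging` is hypothetical, and treating the 50MB limit as 50 * 1024 * 1024 bytes is an assumption.

```javascript
// Assumption: the 50MB limit is interpreted as 50 * 1024 * 1024 bytes.
const PACKAGE_LIMIT_BYTES = 50 * 1024 * 1024;

// Hypothetical helper: picks the npm script to use for a given zip size in bytes.
// The size could be obtained with, for example, fs.statSync('package.zip').size.
function choosePackaging(zipSizeBytes) {
  return zipSizeBytes <= PACKAGE_LIMIT_BYTES
    ? 'package'           // Chrome fits inside the zip
    : 'package-nochrome'; // Chrome must be downloaded from S3 at startup
}

console.log(choosePackaging(45 * 1024 * 1024));
console.log(choosePackaging(60 * 1024 * 1024));
```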
Run npm run package-nochrome, deploy the package.zip, and set the following environment variables on Lambda.
- CHROME_BUCKET (required): S3 bucket where Chrome is stored
- CHROME_KEY (optional): S3 key. Default: headless_shell.tar.gz
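A minimal sketch of how these two variables could be resolved into an S3 object location, assuming the default key stated above. The helper name `chromeS3Location` is hypothetical. The returned `Bucket`/`Key` shape matches the parameters that the AWS SDK's S3 `getObject` call expects.

```javascript
// Hypothetical helper (not from the kit): resolves where Chrome should be downloaded from.
function chromeS3Location(env) {
  if (!env.CHROME_BUCKET) {
    // CHROME_BUCKET is documented as required.
    throw new Error('CHROME_BUCKET must be set when Chrome is not bundled in the package');
  }
  return {
    Bucket: env.CHROME_BUCKET,
    Key: env.CHROME_KEY || 'headless_shell.tar.gz', // documented default
  };
}

// 'my-chrome-bucket' is a placeholder bucket name.
const location = chromeS3Location({ CHROME_BUCKET: 'my-chrome-bucket' });
console.log(location);
```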
This kit includes a Chrome binary that I built myself, because the official Chrome build installed by Puppeteer has problems running on Lambda (missing shared libraries, etc.).
If you want to use the latest Chrome, run chrome/buildChrome.sh on an EC2 instance with at least 16GB of memory and a 30GB volume.
See also serverless-chrome.
Once you have built it, link to headless_shell.tar.gz in the chrome dir.
Currently, the best way to deploy is to upload the package.zip file to S3 and point the Lambda function at its S3 link.
Testing in Lambda:
- Normal request
{
"url": "https://google.com",
"trackResponses": false,
"getCookies": false
}
- Request with XHR responses and cookies (shopee.vn is an SPA written in React)
{
"url": "https://shopee.vn/Astaxanthin-i.25321200.413097508",
"trackResponses": true,
"getCookies": true
}
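To make the shape of these test events concrete, here is a small sketch of how a handler might validate one and apply defaults. The helper name `parseRequest` is hypothetical, and defaulting both flags to false when they are omitted is an assumption rather than documented behavior.

```javascript
// Hypothetical helper (not from the kit): validates a test event and fills in defaults.
function parseRequest(event) {
  if (!event || typeof event.url !== 'string') {
    throw new Error('event.url must be a string');
  }
  new URL(event.url); // throws a TypeError if the URL is malformed
  return {
    url: event.url,
    trackResponses: event.trackResponses === true, // assumed default: false
    getCookies: event.getCookies === true,         // assumed default: false
  };
}

const normal = parseRequest({ url: 'https://google.com', trackResponses: false, getCookies: false });
const minimal = parseRequest({ url: 'https://google.com' });
console.log(normal, minimal);
```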
I made a Starter Kit for running Puppeteer/Headless Chrome on Lambda (in Japanese) - sambaiz-net