Question: how to use this library with Puppeteer? #421

Open
zirkelc opened this issue Apr 22, 2024 · 11 comments

Comments

@zirkelc

zirkelc commented Apr 22, 2024

Hi,

I stumbled upon this library via this post. The post seems a bit outdated because the library has changed; for example, it no longer exports @duckduckgo/autoconsent/dist/autoconsent.puppet.js.

Is there an official how-to guide on using this library with Puppeteer? Or at least an up-to-date code example?

@muodov
Member

muodov commented Apr 22, 2024

@zirkelc there's no "official" guide, but you basically need to hook into the messaging API.
There's a module in the tracker-radar-collector project that might be helpful. It's up to date, but I'm guessing you'd need to adapt it to your needs.

@zirkelc
Author

zirkelc commented Apr 22, 2024

@muodov thank you for the quick response. I was actually hoping it would be easier to get it to work with Puppeteer 😢

I will check out the mentioned project; it looks like a good starting point.

@abhisheksurve45

Hey, any luck running it with puppeteer?

@zirkelc
Author

zirkelc commented Jul 25, 2024

@abhisheksurve45 no, I didn't go any further with this. I was able to run https://github.com/duckduckgo/tracker-radar-collector/blob/main/collectors/CMPCollector.js locally with Puppeteer, but it uses quite an old version of Puppeteer (v10), and I didn't test it with a newer version.

@muodov
Member

muodov commented Jul 25, 2024

Sorry to hear that, @zirkelc. Unfortunately, investing in the collector updates is not our priority right now, so I would recommend against using tracker-radar-collector for side projects if all you need is to run autoconsent.
To run autoconsent in Puppeteer, you can write your own content script, import autoconsent there, and implement the background part as I described in the previous comment. Feel free to ask questions here if something is unclear.
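
A minimal sketch of the ordering this comment implies, assuming a Puppeteer page and the bundled script used elsewhere in this thread (`dist/autoconsent.playwright.js`). The helper name `runWithAutoconsent` is hypothetical, and the sketch has not been verified against a current autoconsent release; it only shows the message bridge being registered before the content script and before navigation.

```javascript
// Hypothetical helper: wires a content script into a Puppeteer-like page.
// Order matters: bridge first, script second, navigation last.
async function runWithAutoconsent(page, url, contentScript, onMessage) {
  // 1. Expose the page -> Node channel the content script is assumed to call.
  await page.exposeFunction('autoconsentSendMessage', onMessage);
  // 2. Register the content script source so it runs in every new document.
  await page.evaluateOnNewDocument(contentScript);
  // 3. Navigate only now, so the script's first message already has a listener.
  await page.goto(url);
}
```

Here `contentScript` would be the file contents read with `fs.readFileSync`, and `onMessage` is the "background part" that answers the content script's messages.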

@zirkelc
Author

zirkelc commented Jul 26, 2024

Hey @muodov, no need to be sorry. I'm glad a library like this even exists! :-)

I didn't investigate it further, because I was able to simply block the cookie banners, etc. from loading/appearing in the first place. That's enough for me at the moment, even though it's not 100% reliable.
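
The comment does not say how the banners were blocked. One common approach, shown here purely as an assumption, is Puppeteer request interception with a host blocklist. The host names below are placeholders, and `shouldBlock` and `blockBannerRequests` are hypothetical helper names.

```javascript
// Placeholder blocklist: replace with the hosts that actually serve the banners.
const BLOCKED_HOSTS = ['cmp.example', 'consent.example'];

// True when the request URL's host equals a blocked host or is a subdomain of it.
function shouldBlock(requestUrl, blockedHosts = BLOCKED_HOSTS) {
  let host;
  try {
    host = new URL(requestUrl).hostname;
  } catch {
    return false; // not a parseable URL, let it through
  }
  return blockedHosts.some((b) => host === b || host.endsWith('.' + b));
}

// Aborts matching requests on a Puppeteer page and lets everything else continue.
async function blockBannerRequests(page) {
  await page.setRequestInterception(true);
  page.on('request', (request) => {
    if (shouldBlock(request.url())) request.abort();
    else request.continue();
  });
}
```

As the comment notes, this kind of blocking is not fully reliable: banners served from the site's own origin would not match a host list.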

Newer versions of headless Chrome (now called Chrome for Testing) allow loading extensions. I verified this on my local machine on macOS. However, it doesn't work yet for Linux builds. My hope is that I can simply use this as a Chrome extension at some point in the future.
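
For reference, loading an unpacked extension from Puppeteer is usually done with two Chrome command-line flags. This is a sketch, assuming a Chrome build that still honors `--load-extension` and an unpacked autoconsent extension directory; the helper name and the directory path are hypothetical.

```javascript
const path = require('node:path');

// Builds launch arguments that load exactly one unpacked extension.
// Merge the result into the options passed to puppeteer.launch().
function extensionLaunchOptions(extensionDir) {
  const dir = path.resolve(extensionDir);
  return {
    args: [
      `--disable-extensions-except=${dir}`,
      `--load-extension=${dir}`,
    ],
  };
}
```

Usage would look like `puppeteer.launch({ headless: false, ...extensionLaunchOptions('./autoconsent-extension') })`, with the headless setting chosen according to what the installed Chrome build supports.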

@teammakdi

Hi, I'm trying to opt out of cookies using Puppeteer:

const puppeteer = require('puppeteer');

const fs = require('node:fs');

const baseContentScript = fs.readFileSync(
    require.resolve('./node_modules/@duckduckgo/autoconsent/dist/autoconsent.playwright.js'),
    'utf8'
);

const contentScript = `
window.autoconsentSendMessage = (msg) => {
    window.cdpAutoconsentSendMessage(JSON.stringify(msg));
};
` + baseContentScript;

(async () => {
    const browser = await puppeteer.launch({ 
        headless: false,
        devtools: false
    });
    try {
        const page = await browser.newPage();

        await page.setViewport({ width: 1280, height: 1024 });
        const client = await page.createCDPSession();

        await client.send('Page.enable');
        await client.send('Runtime.enable');


        client.on('Runtime.executionContextCreated', async ({context}) => {
            try {
                await client.send('Runtime.evaluate', {
                    expression: contentScript
                });
            } catch (e) {
                console.log(e)
            }
        });

        await client.send('Runtime.addBinding', {
            name: 'autoconsentReceiveMessage'
        });

        await client.send('Runtime.evaluate', {
            expression: `autoconsentReceiveMessage({ type: "init", config: ${JSON.stringify({
                enabled: true,
                autoAction: 'optOut',
                disabledCmps: [],
                enablePrehide: true,
                detectRetries: 20
            })} })`
        });


        await client.send('Runtime.evaluate', {
            expression: `autoconsentReceiveMessage({ type: "optOut" })`
        });

        await new Promise(r => setTimeout(r, 3000));

        await page.goto('https://www.npr.org/sections/national', { waitUntil: ['load', 'domcontentloaded', 'networkidle0'] });

        await new Promise(r => setTimeout(r, 5000));

        await page.screenshot({ type: 'png', path: 'screenshot.png' });
    } catch (e) {
        console.log(e)
    } finally {
    }
})();

But it is not working. I've confirmed via Ghostery that autoconsent is able to accept/reject cookies on this page.

Can you please help me figure out what might be wrong?

Thanks.

@muodov

@muodov
Member

muodov commented Aug 5, 2024

@teammakdi I think the order of events is off in the script; for example, it sends the init message before actually loading the page. The logic and data format are also wrong: for example, you need to send initResp instead of init.
If you want to go this way, I suggest making sure you really understand the API docs and studying the implementation of the Playwright tests. If that seems like too much work, I'd consider loading the autoconsent extension as a whole, like @zirkelc suggested above.
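
To illustrate the `init` / `initResp` point, here is a small sketch of the reply side. It assumes the content script sends `{ type: "init" }` through `autoconsentSendMessage` and expects the answer through `autoconsentReceiveMessage`, as the names used in this thread suggest. The config values are copied from the script above; whether `initResp` needs further fields (for example rules) depends on the library version and should be checked against the API docs. `replyExpression` is a hypothetical helper name.

```javascript
// Config values taken from the script posted earlier in this thread.
const CONFIG = {
  enabled: true,
  autoAction: 'optOut',
  disabledCmps: [],
  enablePrehide: true,
  detectRetries: 20,
};

// Returns the expression to evaluate in the page in response to a
// content-script message, or null when this sketch has no reply for it.
// Only "init" is handled; other message types from the API docs are ignored.
function replyExpression(msg, config = CONFIG) {
  if (!msg || msg.type !== 'init') return null;
  const reply = { type: 'initResp', config };
  return `autoconsentReceiveMessage(${JSON.stringify(reply)})`;
}
```

In Puppeteer, a non-null result could be passed to `page.evaluate(expression)` from inside the message callback, after the page has loaded the content script.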

@teammakdi

Hi @muodov

I went through the API docs https://github.com/duckduckgo/autoconsent/blob/main/playwright/runner.ts, but I wasn't able to get it running. Primarily, I'm getting undefined for both frame and msg in the autoconsentSendMessage callback.

Example script

const puppeteer = require('puppeteer')

const fs = require('node:fs')

const baseContentScript = fs.readFileSync(
  require.resolve('./../node_modules/@duckduckgo/autoconsent/dist/autoconsent.playwright.js'),
  'utf8'
);

(async () => {
  const browser = await puppeteer.launch({
    headless: false,
    devtools: true
  })
  try {
    const page = await browser.newPage()

    await page.setViewport({
      width: 1280,
      height: 1024
    })

    await page.exposeFunction('autoconsentSendMessage', ({ frame }, msg) => {
      console.log(frame, msg)
    })
    await page.goto('https://www.npr.org/sections/national', {
      waitUntil: ['load', 'domcontentloaded', 'networkidle0']
    })
    await new Promise(resolve => setTimeout(resolve, 2000))
    
    await page.evaluate(baseContentScript)
    page.frames().forEach(async (frame) => {
      await frame.evaluate(baseContentScript)
    })
    page.on('framenavigated', async () => {
      await page.evaluate(baseContentScript)
    })
    await new Promise(resolve => setTimeout(resolve, 2000))

    await page.screenshot({
      type: 'png',
      path: 'screenshot.png'
    })
  } catch (e) {
    console.log(e)
  }
})()

Also, I don't want to load it as an extension, since I want to accept/reject cookies for specific domains and not at the global Puppeteer/Chrome level.
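
A likely cause of the undefined values: the `({ frame }, msg)` signature matches Playwright's `exposeBinding`, which passes a source object as the first argument, whereas Puppeteer's `page.exposeFunction` calls the callback with only the arguments supplied by the page. The sketch below adapts a Playwright-style callback for `exposeFunction`. The helper name is hypothetical, and because `exposeFunction` does not report which frame called, the frame has to be supplied by the caller (the main frame is assumed in the usage note).

```javascript
// Wraps a Playwright-style binding callback, (source, msg) => ..., so it can
// be passed to Puppeteer's page.exposeFunction, which invokes its callback
// with the page-supplied arguments only (here: the message).
function toExposeFunctionCallback(frame, bindingCallback) {
  return (msg) => bindingCallback({ frame }, msg);
}
```

Usage would be `await page.exposeFunction('autoconsentSendMessage', toExposeFunctionCallback(page.mainFrame(), callback))`, which only covers messages coming from the main frame.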

@ItayElgazar

@teammakdi did you manage to solve it somehow?

@freiondrej
Contributor

Hey @teammakdi, if you can provide a sample repo that reproduces your error, I could give it a go :) I did some experimenting recently with overriding autoconsentSendMessage here. It's probably quite different from what you need, but I'm sharing it to let you know the overriding might be worth exploring :) I have no experience with Puppeteer myself, but a sample repo from you would let me try to get your use case working 🚀
