speedreader: Update page tests
Updating to kuchikiki 0.8.2 changed the order of attributes in
the distilled page output. Update the expected test data so the
SpeedreaderRewriterPagesTest.CheckPages unit test passes.
rillian committed Sep 19, 2023
1 parent 32203ba commit 760ab8f
Showing 106 changed files with 652 additions and 652 deletions.
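
The commit message attributes the changes to attribute reordering alone. Below is a minimal sketch of how one might spot-check that an old/new pair of expected-output lines differs only in attribute order. It assumes Python's standard `html.parser`; it is not the project's test harness and does not use kuchikiki. The helper names `normalized_events` and `same_modulo_attribute_order` are hypothetical.

```python
from html.parser import HTMLParser


class _Events(HTMLParser):
    """Collects parse events, sorting each tag's attributes by name."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.events = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; sort by name only so that
        # valueless attributes (value None) never get compared to strings.
        ordered = tuple(sorted(attrs, key=lambda kv: kv[0]))
        self.events.append(("start", tag, ordered))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))


def normalized_events(html):
    """Return the event stream for `html` with attribute order normalized."""
    parser = _Events()
    parser.feed(html)
    parser.close()
    return parser.events


def same_modulo_attribute_order(old, new):
    """True if the two fragments differ at most in attribute order."""
    return normalized_events(old) == normalized_events(new)


# Shortened version of one changed pair from this commit (link text abbreviated).
old = '<a class="bare" href="https://github.com/matklad/dlx">dlx</a>'
new = '<a href="https://github.com/matklad/dlx" class="bare">dlx</a>'
print(same_modulo_attribute_order(old, new))  # True
```

A check like this only confirms that a pair is equivalent up to attribute order; the unit test itself still compares the serialized output, which is why the expected data had to be regenerated.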
@@ -112,7 +112,7 @@ <h3>

</div>

<p><a href="https://4.bp.blogspot.com/-TmFaWx0UoOM/XG7FkD3y42I/AAAAAAAAOe0/D8cAfvL486A6oU4zHmAeQF-Ylo8FytgHQCLcBGAs/s1600/download.png" imageanchor="1"><img data-original-height="252" data-original-width="374" height="268" src="https://4.bp.blogspot.com/-TmFaWx0UoOM/XG7FkD3y42I/AAAAAAAAOe0/D8cAfvL486A6oU4zHmAeQF-Ylo8FytgHQCLcBGAs/s400/download.png" width="400"></a></p>
<p><a href="https://4.bp.blogspot.com/-TmFaWx0UoOM/XG7FkD3y42I/AAAAAAAAOe0/D8cAfvL486A6oU4zHmAeQF-Ylo8FytgHQCLcBGAs/s1600/download.png" imageanchor="1"><img width="400" data-original-height="252" data-original-width="374" height="268" src="https://4.bp.blogspot.com/-TmFaWx0UoOM/XG7FkD3y42I/AAAAAAAAOe0/D8cAfvL486A6oU4zHmAeQF-Ylo8FytgHQCLcBGAs/s400/download.png"></a></p>

<p>
Let’s train the network via gradient descent. JAX’s random number generator is set up differently than Numpy’s, so to initialize network parameters we’ll use the original Numpy library (onp) to generate random numbers. We’ll also import the tree_multimap utility to easily manipulate collections of per-parameter gradients (for TensorFlow users, this is analogous to nest.map_structure for Tensors).
@@ -134,7 +134,7 @@ <h3>

<p>
Evaluating our network again, we see that the sinusoid curve has been correctly approximated.</p>
<p><a href="https://4.bp.blogspot.com/-NuCVncO9wmI/XG7FjVOAetI/AAAAAAAAOes/G6hN1eRupnEAxrfgdeccjvrOz0G9oyCKgCEwYBhgL/s1600/download%2B%25281%2529.png" imageanchor="1"><img data-original-height="252" data-original-width="383" height="262" src="https://4.bp.blogspot.com/-NuCVncO9wmI/XG7FjVOAetI/AAAAAAAAOes/G6hN1eRupnEAxrfgdeccjvrOz0G9oyCKgCEwYBhgL/s400/download%2B%25281%2529.png" width="400"></a></p>
<p><a href="https://4.bp.blogspot.com/-NuCVncO9wmI/XG7FjVOAetI/AAAAAAAAOes/G6hN1eRupnEAxrfgdeccjvrOz0G9oyCKgCEwYBhgL/s1600/download%2B%25281%2529.png" imageanchor="1"><img width="400" data-original-height="252" data-original-width="383" height="262" src="https://4.bp.blogspot.com/-NuCVncO9wmI/XG7FjVOAetI/AAAAAAAAOes/G6hN1eRupnEAxrfgdeccjvrOz0G9oyCKgCEwYBhgL/s400/download%2B%25281%2529.png"></a></p>
<br>
<p>
This result is nothing to write home about, but in just a moment we’ll re-use a lot of these functions to implement MAML.</p>
@@ -158,7 +158,7 @@ <h3>
<p>
Now let’s extend our sinusoid regression task to a multi-task problem, in which the sinusoid function can have varying phases and amplitudes. This task was proposed in the MAML paper as a way to illustrate how MAML works on a toy problem. Below are some points sampled from two different tasks, divided into “train” (used to compute the inner loss) and “validation” splits (sampled from the same task, used to compute the outer loss).</p></div>

<p><a href="https://3.bp.blogspot.com/-_juVWfK0Uj0/XG7FjZzbFnI/AAAAAAAAOe4/LG9MdTEaincGPjlS4p6lqdqP-AWmiihsACEwYBhgL/s1600/download%2B%25282%2529.png" imageanchor="1"><img data-original-height="252" data-original-width="383" height="262" src="https://3.bp.blogspot.com/-_juVWfK0Uj0/XG7FjZzbFnI/AAAAAAAAOe4/LG9MdTEaincGPjlS4p6lqdqP-AWmiihsACEwYBhgL/s400/download%2B%25282%2529.png" width="400"></a></p>
<p><a href="https://3.bp.blogspot.com/-_juVWfK0Uj0/XG7FjZzbFnI/AAAAAAAAOe4/LG9MdTEaincGPjlS4p6lqdqP-AWmiihsACEwYBhgL/s1600/download%2B%25282%2529.png" imageanchor="1"><img width="400" data-original-height="252" data-original-width="383" height="262" src="https://3.bp.blogspot.com/-_juVWfK0Uj0/XG7FjZzbFnI/AAAAAAAAOe4/LG9MdTEaincGPjlS4p6lqdqP-AWmiihsACEwYBhgL/s400/download%2B%25282%2529.png"></a></p>


<p><span id="docs-internal-guid-313a0369-7fff-b1c4-535d-2c17f57be320"><br>Suppose a task loss function $\mathcal{L}$ is defined with respect to model parameters $\theta$, input features $X$, output labels $Y$. Let $x_1, y_1$ and $x_2, y_2$ be identically distributed task instance data sampled from $X, Y$. Then MAML optimizes the following:<p>$\mathcal{L}(\theta - \nabla \mathcal{L}(\theta, x_1, y_1), x_2, y_2)$</p><p>MAML’s inner update operator is just gradient descent on the regression loss. The outer loss, <span>maml_loss</span>, is simply the original loss applied <i>after</i> the inner_update operator has been applied. One interpretation of the MAML objective is that it is a differentiable estimate of a cross-validation loss with respect to a learner. Meta-training results in an <span>inner_update</span> that minimizes the cross-validation loss.</p></span></p>
@@ -183,7 +183,7 @@ <h3>
</td></tr>
</tbody></table>
</div>
<p><a href="https://3.bp.blogspot.com/-_B1gjozIcYk/XG7FjFCW-wI/AAAAAAAAOe8/MifBFI8wAtIKsoxOYUv6Lt23zk6rFyGbACEwYBhgL/s1600/download%2B%25283%2529.png" imageanchor="1"><img data-original-height="252" data-original-width="384" height="262" src="https://3.bp.blogspot.com/-_B1gjozIcYk/XG7FjFCW-wI/AAAAAAAAOe8/MifBFI8wAtIKsoxOYUv6Lt23zk6rFyGbACEwYBhgL/s400/download%2B%25283%2529.png" width="400"></a></p>
<p><a href="https://3.bp.blogspot.com/-_B1gjozIcYk/XG7FjFCW-wI/AAAAAAAAOe8/MifBFI8wAtIKsoxOYUv6Lt23zk6rFyGbACEwYBhgL/s1600/download%2B%25283%2529.png" imageanchor="1"><img width="400" data-original-height="252" data-original-width="384" height="262" src="https://3.bp.blogspot.com/-_B1gjozIcYk/XG7FjFCW-wI/AAAAAAAAOe8/MifBFI8wAtIKsoxOYUv6Lt23zk6rFyGbACEwYBhgL/s400/download%2B%25283%2529.png"></a></p>

<p>
At meta-training time, the network learns to “quickly adapt” to x1, y1 in order to minimize cross-validation error on a new set of points x2. At deployment time (shown in the plot above), when we have a <i>new</i> task (new amplitude and phase not seen at training time), the model can apply the <span>inner_update</span> operator to fit the target sinusoid much faster and with fewer data samples than simply re-training the parameters with SGD.<br>
@@ -235,7 +235,7 @@ <h3>
When we plot the MAML objective as a function of training step, we see that the batched MAML trains much faster (as a function of gradient steps) and also has lower variance during training.</p></div>
<div>
<br>
<p><a href="https://2.bp.blogspot.com/-B_XbZrb3x-Y/XG7Fj-YtcNI/AAAAAAAAOe8/u8TtEfoZFF0kIa--Zlm23kFrtbjZWQOKwCEwYBhgL/s1600/download%2B%25284%2529.png" imageanchor="1"><img data-original-height="252" data-original-width="381" height="263" src="https://2.bp.blogspot.com/-B_XbZrb3x-Y/XG7Fj-YtcNI/AAAAAAAAOe8/u8TtEfoZFF0kIa--Zlm23kFrtbjZWQOKwCEwYBhgL/s400/download%2B%25284%2529.png" width="400"></a></p>
<p><a href="https://2.bp.blogspot.com/-B_XbZrb3x-Y/XG7Fj-YtcNI/AAAAAAAAOe8/u8TtEfoZFF0kIa--Zlm23kFrtbjZWQOKwCEwYBhgL/s1600/download%2B%25284%2529.png" imageanchor="1"><img width="400" data-original-height="252" data-original-width="381" height="263" src="https://2.bp.blogspot.com/-B_XbZrb3x-Y/XG7Fj-YtcNI/AAAAAAAAOe8/u8TtEfoZFF0kIa--Zlm23kFrtbjZWQOKwCEwYBhgL/s400/download%2B%25284%2529.png"></a></p>

<br>
<h3>
@@ -591,7 +591,7 @@
</ul>
</div>
<p>Running this test gives us much more confidence in the correctness of the implementation!
Full source, including the code to restore the answer, can be found at <a class="bare" href="https://github.com/matklad/dlx">https://github.com/matklad/dlx</a>.</p>
Full source, including the code to restore the answer, can be found at <a href="https://github.com/matklad/dlx" class="bare">https://github.com/matklad/dlx</a>.</p>
<p>It’s also interesting to reflect on the unusual effectiveness of linked list for this problem.
Remember that on the modern hardware, a <code>Vec</code> beats <code>LinkedList</code> for the overwhelming majority of the problems.
While linked lists have a better theoretical complexity for the insertion and removal from the middle, most benchmarks are dominated by the traversal time to get to this middle.
@@ -3,5 +3,5 @@
https://smartcdn.gprod.postmedia.digital/nationalpost/wp-content/uploads/2020/06/abtracetogether.png?h=288&amp;strip=all&amp;quality=80 3x" width="96"> </picture> <h3 class="more-topic__item-text"> <span class="more-topic__item-text-clamp"> Opt in or opt out? Officials face difficult ethical decision over COVID-19 contact tracing apps </span> </h3> </a> </li> <li class="more-topic__item" data-carousel-item=""> <a href="https://nationalpost.com/opinion/there-is-a-way-to-track-coronavirus-through-peoples-cellphones-and-protect-their-privacy/wcm/ff29bd9f-1451-4221-b5be-0a2a2c4e430d"> <picture class="more-topic__item-image"> <img alt="Telecommunications providers in Israel and Taiwan are working with their respective governments to provide cellular location data to track their citizens’ movements, monitor their exposure to COVID-19 and maintain compliance with social-distancing directives." class="lazyload" data-src="https://smartcdn.gprod.postmedia.digital/nationalpost/wp-content/uploads/2020/04/phone.jpg?h=96&amp;strip=all&amp;quality=80" height="96" loading="lazy" src="https://smartcdn.gprod.postmedia.digital/nationalpost/wp-content/uploads/2020/04/phone.jpg?h=96&amp;strip=all&amp;quality=5" srcset="https://smartcdn.gprod.postmedia.digital/nationalpost/wp-content/uploads/2020/04/phone.jpg?h=96&amp;strip=all&amp;quality=80,
https://smartcdn.gprod.postmedia.digital/nationalpost/wp-content/uploads/2020/04/phone.jpg?h=192&amp;strip=all&amp;quality=80 2x,
https://smartcdn.gprod.postmedia.digital/nationalpost/wp-content/uploads/2020/04/phone.jpg?h=288&amp;strip=all&amp;quality=80 3x" width="96"> </picture> <h3 class="more-topic__item-text"> <span class="more-topic__item-text-clamp"> Opinion: There is a way to track coronavirus through people’s data and protect their privacy </span> </h3> </a> </li> </ol> </section> <p>The Agency is planning to track population movement for roughly the next five years, including to address other public health issues, such as “other infectious diseases, chronic disease prevention and mental health,” the spokesperson added.</p> <p>Privacy advocates raised concerns to the National Post about the long-term implications of the program.</p> <p>“I think that the Canadian public will find out about many other such unauthorized surveillance initiatives before the pandemic is over—and afterwards,” David Lyon, author of Pandemic Surveillance and former director of the Surveillance Studies Centre at Queen’s University, said in an email.</p> </section> <div class="ad__section-border article-content__ad-group"> <section aria-describedby="advertisment2756889159173589331819438340447713" class="ad"> </section> </div> <section class="article-content__content-group"> <p>Lyon warned that PHAC “uses the same kinds of ‘reassuring’ language as national security agencies use, for instance not mentioning possibilities for re-identifying data that has been ‘de-identified.’”</p> <p>“In principle, of course, cell data can&nbsp;be used for tracking.”</p> <p>Mobility data analysis “helps to advance public health objectives,” the PHAC spokesperson said. 
The findings have been regularly shared with provinces and territories via the special advisory committee to “inform public health messaging, planning and policy development,” the spokesperson said.</p> <p data-async="">The data is also used for the <a data-evt="click" data-evt-typ="click" data-evt-val="{&quot;control_fields&quot;: {&quot;mparticle&quot;: {&quot;keys&quot;: {&quot;click_source_type&quot;: &quot;click_source_type&quot;, &quot;anchor_text&quot;: &quot;anchor_text&quot;, &quot;target_url&quot;: &quot;target_url&quot;}, &quot;mp_event_type&quot;: &quot;Navigation&quot;, &quot;extra_keys&quot;: [&quot;click_vertical_position_percentage&quot;, &quot;click_vertical_position_pixels&quot;]}}, &quot;click_source_type&quot;: &quot;in-page link&quot;, &quot;anchor_text&quot;: &quot;COVID Trends&quot;, &quot;target_url&quot;: &quot;https://health-infobase.canada.ca/covid-19/covidtrends/&quot;}" href="https://health-infobase.canada.ca/covid-19/covidtrends/" rel="noopener noreferrer" target="_blank">COVID Trends</a> portal, a dashboard that provides a summarized data of movement trends.</p> <p>Lyon urged a need for greater information “regarding exactly what was done, what was achieved and whether or not it truly served the interests of Canadian citizens.”</p> </section> <div class="ad__section-border article-content__ad-group"> <section aria-describedby="advertisment7092511175645504205384746069692436" class="ad"> </section> </div> <section class="article-content__content-group"> <figure class="embedded-image"> <picture class="embedded-image__ratio"> <img alt="Privacy advocates say public health monitoring jeopradizes user privacy." 
class="embedded-image__image lazyload" data-src="https://smartcdn.gprod.postmedia.digital/nationalpost/wp-content/uploads/2021/12/pjt-hand-mural-big-tech-watching-4_80865302-w.jpg?quality=90&amp;strip=all&amp;w=288" data-srcset="https://smartcdn.gprod.postmedia.digital/nationalpost/wp-content/uploads/2021/12/pjt-hand-mural-big-tech-watching-4_80865302-w.jpg?quality=90&amp;strip=all&amp;w=288,
https://smartcdn.gprod.postmedia.digital/nationalpost/wp-content/uploads/2021/12/pjt-hand-mural-big-tech-watching-4_80865302-w.jpg?quality=90&amp;strip=all&amp;w=576 2x" height="750" loading="lazy" src="https://smartcdn.gprod.postmedia.digital/nationalpost/wp-content/uploads/2021/12/pjt-hand-mural-big-tech-watching-4_80865302-w.jpg?quality=5&amp;strip=all&amp;w=100" srcset="https://smartcdn.gprod.postmedia.digital/nationalpost/wp-content/uploads/2021/12/pjt-hand-mural-big-tech-watching-4_80865302-w.jpg?quality=90&amp;strip=all&amp;w=288,
https://smartcdn.gprod.postmedia.digital/nationalpost/wp-content/uploads/2021/12/pjt-hand-mural-big-tech-watching-4_80865302-w.jpg?quality=90&amp;strip=all&amp;w=576 2x" width="1000"> </picture> <figcaption class="image-caption"> <span class="caption"> Privacy advocates say public health monitoring jeopradizes user privacy.</span> <span class="distributor">File</span> </figcaption> </figure> <p>Deploying surveillance tools for public health purposes also raises to the issue of equity, Martin French, an associate professor of Concordia University focusing on surveillance, privacy and social justice, noted in an email.</p> <p>“There are populations that could experience an intensification of tracking that could have harmful (rather than beneficial) repercussions.”</p> <p>Increased use of surveillance technology during the COVID-19 pandemic has created a new normal in the name of security, Lyon said.</p> <p>“The pandemic has created opportunities for a massive surveillance surge on many levels—not only for public health, but also for monitoring those working, shopping and learning from home.”</p> <p>“Evidence is coming in from many sources, from countries around the world, that what was seen as a huge surveillance surge—post 9/11—is now completely upstaged by pandemic surveillance,” he added.</p> </section> <div class="ad__section-border article-content__ad-group"> <section aria-describedby="advertisment2176600709457159692485038518947555" class="ad"> </section> </div> <section class="article-content__content-group"> <p>In a notice posted earlier this week, the agency called for contractors with access to “cell-tower/operator location data in the response to the COVID-19 pandemic and for other public health applications.” It asks for “de-identified cell-tower based location data from across Canada” beginning from from Jan. 
2019 until the end of the contract period on May 31, 2023, with possibility of three one-year extensions.</p> <p>The contractor must provide anonymized data to PHAC and ensure its users have the ability to easily opt-out of mobility data sharing programs, the agency says.</p> <p>PHAC’s privacy management division conducted an assessment and “determined that since no personal information is being acquired through this contract, there are no concerns under the Privacy Act,” the spokesperson said.</p> <p>The Office of the Privacy Commissioner said it is “following up with PHAC to obtain more information about the proposed initiative” and could not provide additional comment at this time.</p> </section> <section class="article-content__share-group article-delimiter" data-evt="beforeunload" data-evt-typ="page_scroll" data-evt-val="{&quot;control_fields&quot;: {&quot;mparticle&quot;: {&quot;mp_custom_flags&quot;: [&quot;Google.NonInteraction&quot;, &quot;Google.Page&quot;], &quot;extra_keys&quot;: [&quot;percentage_of_page_viewed&quot;, &quot;percentage_of_story_viewed&quot;]}}}"> </section> </article> </div></article></main></body>
https://smartcdn.gprod.postmedia.digital/nationalpost/wp-content/uploads/2021/12/pjt-hand-mural-big-tech-watching-4_80865302-w.jpg?quality=90&amp;strip=all&amp;w=576 2x" height="750" loading="lazy" src="https://smartcdn.gprod.postmedia.digital/nationalpost/wp-content/uploads/2021/12/pjt-hand-mural-big-tech-watching-4_80865302-w.jpg?quality=5&amp;strip=all&amp;w=100" width="1000" srcset="https://smartcdn.gprod.postmedia.digital/nationalpost/wp-content/uploads/2021/12/pjt-hand-mural-big-tech-watching-4_80865302-w.jpg?quality=90&amp;strip=all&amp;w=288,
https://smartcdn.gprod.postmedia.digital/nationalpost/wp-content/uploads/2021/12/pjt-hand-mural-big-tech-watching-4_80865302-w.jpg?quality=90&amp;strip=all&amp;w=576 2x"> </picture> <figcaption class="image-caption"> <span class="caption"> Privacy advocates say public health monitoring jeopradizes user privacy.</span> <span class="distributor">File</span> </figcaption> </figure> <p>Deploying surveillance tools for public health purposes also raises to the issue of equity, Martin French, an associate professor of Concordia University focusing on surveillance, privacy and social justice, noted in an email.</p> <p>“There are populations that could experience an intensification of tracking that could have harmful (rather than beneficial) repercussions.”</p> <p>Increased use of surveillance technology during the COVID-19 pandemic has created a new normal in the name of security, Lyon said.</p> <p>“The pandemic has created opportunities for a massive surveillance surge on many levels—not only for public health, but also for monitoring those working, shopping and learning from home.”</p> <p>“Evidence is coming in from many sources, from countries around the world, that what was seen as a huge surveillance surge—post 9/11—is now completely upstaged by pandemic surveillance,” he added.</p> </section> <div class="ad__section-border article-content__ad-group"> <section aria-describedby="advertisment2176600709457159692485038518947555" class="ad"> </section> </div> <section class="article-content__content-group"> <p>In a notice posted earlier this week, the agency called for contractors with access to “cell-tower/operator location data in the response to the COVID-19 pandemic and for other public health applications.” It asks for “de-identified cell-tower based location data from across Canada” beginning from from Jan. 
2019 until the end of the contract period on May 31, 2023, with possibility of three one-year extensions.</p> <p>The contractor must provide anonymized data to PHAC and ensure its users have the ability to easily opt-out of mobility data sharing programs, the agency says.</p> <p>PHAC’s privacy management division conducted an assessment and “determined that since no personal information is being acquired through this contract, there are no concerns under the Privacy Act,” the spokesperson said.</p> <p>The Office of the Privacy Commissioner said it is “following up with PHAC to obtain more information about the proposed initiative” and could not provide additional comment at this time.</p> </section> <section class="article-content__share-group article-delimiter" data-evt="beforeunload" data-evt-typ="page_scroll" data-evt-val="{&quot;control_fields&quot;: {&quot;mparticle&quot;: {&quot;mp_custom_flags&quot;: [&quot;Google.NonInteraction&quot;, &quot;Google.Page&quot;], &quot;extra_keys&quot;: [&quot;percentage_of_page_viewed&quot;, &quot;percentage_of_story_viewed&quot;]}}}"> </section> </article> </div></article></main></body>