Searching references when looking for terms in the PDF files #14
If omitting the reference section is too difficult, and it could be, especially since we will want to scan sections that occur after the references (tables and figures), perhaps we don't worry about it. We could make strategic decisions on how to present the keyword search information. If we opt to only use a word cloud, then terms that are found only once or a few times will likely not be seen. Alternatively, if we opt to have a table of key terms, we could impose a lower bound on items to include, which would also deal with this.

I have also been wondering if we should increase the number of assessments summarized. I initially only grabbed 2-3 from each region thinking we were going to use them to guide decisions on what terms to include in our glossary, but if we want to present this as a more robust synthesis we may want to add more assessments. What do others think?
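The lower-bound idea above could be sketched roughly as follows. This is a minimal illustration in Python; the repository's actual text-mining code may well be in R, and the function name, counts, and threshold are all hypothetical.

```python
from collections import Counter

def filter_keyword_counts(counts, min_count=2):
    """Drop terms seen fewer than min_count times, so rare hits
    (e.g., a term that only appears once, in a reference list)
    are excluded from a word cloud or summary table."""
    return {term: n for term, n in counts.items() if n >= min_count}

# Hypothetical counts pooled across assessment PDFs.
counts = Counter({"unfished": 12, "virgin": 1, "depletion": 7})
print(filter_keyword_counts(counts, min_count=2))
```

The same filter works whether the output feeds a table or a word cloud, since both start from the same term-count mapping.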
Good catch, Kelli! Thanks to both of you for providing the solutions already. I will give it a try and see if I can remove the reference section before counting keywords. If not, I agree that we could create a word cloud/table with a lower bound on items.

Chantel, I like the idea of increasing the number of assessments summarized; what would be the maximum number per region? I could also help with modifying the output table so the counts can be grouped by region.
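Grouping the counts by region, as suggested above, could look something like this minimal Python sketch; the per-document counts and region labels are hypothetical (the labels mirror the Science Center names used elsewhere in this thread), and the real tooling may differ.

```python
from collections import defaultdict

def counts_by_region(doc_counts):
    """Sum keyword counts within each region so the output table
    can report one row per (region, term) pair."""
    totals = defaultdict(lambda: defaultdict(int))
    for region, counts in doc_counts:
        for term, n in counts.items():
            totals[region][term] += n
    return {region: dict(terms) for region, terms in totals.items()}

# Hypothetical per-document keyword counts.
doc_counts = [
    ("AFSC", {"unfished": 3, "virgin": 1}),
    ("AFSC", {"unfished": 2}),
    ("NWFSC", {"virgin": 4}),
]
print(counts_by_region(doc_counts))
```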
A few changes to make:
- Remove the number of times each word was found, because we might change which documents are included or what portion of those documents is searched (i.e., #14).
- Remove the reference to SAM.
- Specify that "initial" cannot be searched for.
- Include the rationale for using "unfished".
I went to [Stock Smart](https://www.st.nmfs.noaa.gov/stocksmart?stockname=Scup%20-%20Atlantic%20Coast&stockid=10286) and pulled additional assessment documents for all Science Centers. The number of assessments for some Science Centers was limited by what was available (PIFSC and SWFSC), but I tried to grab a large selection across a range of species. I have added the following files to the Google Drive folder ("Assessment Docs"):
AFSC: 16
NWFSC: 18
NEFSC: 16
PIFSC: 8
SEFSC: 22
SWFSC: 4
Thanks Chantel. Will update the text mining outputs using the updated assessment docs.
@Bai-Li-NOAA I noticed that `unfished|virgin|equilibrium` comes up just once in noaa_17252_DS1.pdf, so I went to the PDF to see where it occurred, and it was in the reference section. Do you think we should try to exclude the reference section from the search, or just not worry about it? The trouble with eliminating the reference section is that it is often in the middle of the document. I vote for just not worrying about it and mentioning it in the Discussion section, or something along those lines. But I wanted to get others' thoughts.
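One heuristic for the "references in the middle of the document" problem is to drop only the span between a References heading and the next post-reference section heading, rather than truncating the whole document. The sketch below is a rough Python illustration; the heading names (`Tables`, `Figures`, `Appendix`) are assumptions, real assessment PDFs vary a lot, and the project's actual tooling may be in R.

```python
import re

def strip_references(text, stop_headings=("Tables", "Figures", "Appendix")):
    """Heuristically remove the span from a 'References' heading up to
    the next post-reference section heading (or end of document), so
    keyword searches do not count hits inside cited titles.
    Heading names are assumptions and will not fit every document."""
    m = re.search(r"^\s*References\s*$", text,
                  flags=re.MULTILINE | re.IGNORECASE)
    if not m:
        return text  # no recognizable References heading; leave text alone
    tail = text[m.end():]
    stop = re.search(r"^\s*(" + "|".join(stop_headings) + r")\s*$", tail,
                     flags=re.MULTILINE | re.IGNORECASE)
    end = m.end() + stop.start() if stop else len(text)
    return text[:m.start()] + text[end:]
```

A fallback like this still misses documents whose references lack a clean heading, which is consistent with the suggestion above to just note the limitation in the Discussion.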