Couple of questions about concepts. #3

Open
wujy2015 opened this issue Jan 16, 2018 · 5 comments

Comments

@wujy2015

Hi,

Does the PRF use abstracts or full papers? How many documents did you use for it? Which ranking function did you use to retrieve the PRF documents? Did you include other concepts in the first retrieval?

In my experiments, I can extract more concepts than I see in your run files. Did you select only the most frequent concepts?

@balaneshin
Contributor

balaneshin commented Jan 17, 2018

Hi @wujy2015
We used the full papers. From the 28 top-ranked documents, we chose the 30 top-ranked concepts. For document retrieval, we used the previously extracted concepts and ran the query using the two-stage method proposed here and described here.

As mentioned above, we kept only the top-ranked concepts produced by the PRF method.
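
For readers following along, here is a minimal sketch of what "30 top-ranked concepts from the 28 top-ranked documents" could look like. It is an assumption, not the scoring used in this repository: the thread only points to a link for the actual formula. The sketch uses a relevance-model (RM1-style) weighting, and the function name `top_feedback_terms`, the `(tokens, weight)` input format, and the `stopwords` parameter are all hypothetical.

```python
from collections import Counter


def top_feedback_terms(ranked_docs, num_docs=28, num_terms=30, stopwords=frozenset()):
    """Score terms from the top-ranked documents, RM1-style (illustrative only).

    ranked_docs: list of (tokens, weight) pairs sorted best first, where
    weight is a non-negative retrieval weight for the document (hypothetical
    input format, e.g. the exponentiated query log-likelihood).
    Returns up to num_terms (term, score) pairs, highest score first.
    """
    feedback = ranked_docs[:num_docs]
    total_weight = sum(weight for _, weight in feedback)
    if total_weight <= 0:
        return []

    term_scores = Counter()
    for tokens, weight in feedback:
        if not tokens:
            continue
        counts = Counter(t for t in tokens if t not in stopwords)
        doc_len = len(tokens)
        for term, count in counts.items():
            # P(term | doc) weighted by the document's normalised retrieval weight
            term_scores[term] += (count / doc_len) * (weight / total_weight)

    return term_scores.most_common(num_terms)
```

With the defaults, a call such as `top_feedback_terms(ranked_docs)` keeps the 28 best documents and returns at most 30 scored terms. Raw term frequency would be the special case where every document gets the same weight and length normalisation is dropped.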

@wujy2015
Author

Does that mean you used #rm from Indri to get the top 28 documents, then used UMLS to get the 30 top-ranked concepts, and then merged them with the previous concepts? How do you rank the concepts?

@balaneshin
Contributor

For PRF concepts, please see this link for how the concept scores are computed and how the top 30 are extracted. We did not use UMLS to rank this concept type.

@wujy2015
Author

So you used RM to compute the 30 most frequent words from the top 28 ranked documents? And do you take those words as the concepts directly, or do you input them into UMLS?

@balaneshin
Contributor

@wujy2015 As Table 2 of the paper shows, we have multiple concept types. In this work, concepts from top-ranked documents (described in line 7 of Table 2) are extracted independently of the UMLS concepts (described in lines 2, 3, 5, and 6 of Table 2). The UMLS concepts are extracted only from the queries (topic summary and topic description). You can see the topic summaries and topic descriptions here. Therefore, we did not feed the concepts from top-ranked documents into UMLS.
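
To make the data flow concrete, the independence described above could be sketched as below. This is only an illustration under stated assumptions: `build_concept_pools`, the pool names, and the two extractor callables are hypothetical placeholders, not code from this repository, and the pools do not map one-to-one onto the rows of Table 2.

```python
def build_concept_pools(topic, top_docs, umls_extractor, prf_extractor):
    """Collect concept pools from independent sources (illustrative only).

    topic: dict with "summary" and "description" strings (hypothetical format).
    top_docs: the top-ranked documents from the first retrieval pass.
    umls_extractor: callable mapping query text to a list of UMLS concepts.
    prf_extractor: callable mapping top-ranked documents to a list of concepts.
    """
    return {
        # UMLS concepts come only from the query fields
        "umls_summary": umls_extractor(topic["summary"]),
        "umls_description": umls_extractor(topic["description"]),
        # PRF concepts come only from the top-ranked documents
        # and are never passed through the UMLS extractor
        "prf_top_docs": prf_extractor(top_docs),
    }
```

The point of the sketch is that the UMLS extractor never sees the feedback documents or their terms; the pools are only combined afterwards, at query time.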
