How do I obtain the F1 score reported in the paper? #12

Open
g1kyne opened this issue Dec 20, 2021 · 6 comments
Comments

g1kyne commented Dec 20, 2021

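The README says: "See running/freebase/pipeline_cwq.py if run CWQ 1.1. See running/freebase/pipeline_grapqh.py if run GraphQuestions. Below, an example on GraphQuestions."
Hello, how do I obtain the F1 score reported in the paper? Is it enough to run the two files above, or do I also need to go through the series of steps listed below them? The running procedure is not clear to me.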

g1kyne commented Dec 20, 2021

Running pipeline_grapqh.py gives this result:
#all_f1_score: 561.5324373390596
#count_number: 1839
How do I get from this to the 21 reported in the paper?

simba0626 (Collaborator) commented

Hello, and thank you for your interest.
561.5324/2608 = 21.53% // 2608 is the number of questions in the full test set
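
The calculation above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in this thread: the function name `average_f1_percent` is hypothetical and does not come from the repository. It assumes that `all_f1_score` is the sum of per-question F1 values and that the denominator is the full test-set size given by the collaborator, not the `count_number` printed by the script.

```python
def average_f1_percent(all_f1_score: float, num_test_questions: int) -> float:
    """Average F1 in percent: summed per-question F1 over the full test-set size.

    Hypothetical helper for illustration; not part of the repository.
    """
    if num_test_questions <= 0:
        raise ValueError("num_test_questions must be positive")
    return 100.0 * all_f1_score / num_test_questions


# Values taken from the GraphQuestions run reported in this thread.
all_f1_score = 561.5324373390596
test_set_size = 2608  # full test set, as stated by the collaborator

print(f"{average_f1_percent(all_f1_score, test_set_size):.2f}")  # prints 21.53
```

Dividing by `count_number` (1839) instead would give a noticeably higher figure, which is why the choice of denominator matters when comparing against the paper.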

g1kyne commented Dec 21, 2021

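Thank you for the answer. When I run the other dataset, CWQ, the output is only:
#module: 3_evaluation
end
Is the CWQ zip package you provided missing the corresponding trained files? Do I need to train the model myself, starting from module=1.0? Do I need to load the knowledge graph into the Virtuoso database first?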

g1kyne commented Dec 21, 2021

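Hello, after running pipeline_cwq.py I got the following result:
#all_f1_score: 930.9517663221876
#count_number: 2225
end
The paper says that one tenth of the 34689 questions is used for testing, so the test set should contain 3468 questions.
However, 930.9517/3468 = 26.84%, which falls short of the expected 31%.
What could be the reason?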

simba0626 (Collaborator) commented

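Hello, and thank you for your interest.
(1) The test set contains 3531 questions.
(2) Your value of 930.95176 is probably the ablation result for SPARQA w/o sentence-level scorer: 930.95176/3531 ≈ 26.36 (though I cannot be sure).
(3) The actual all_f1_score should be a little over 1111: 1111/3531 = 31.46.
Try switching between the two lines below in evaluation/kbcqa_evaluation.py, so that the ranking uses either score or total_score, and see what you get: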
# score_to_queryid_sparql[grounded_graph.score].append(grounded_graph.grounded_query_id) #word level matcher
score_to_queryid_sparql[grounded_graph.total_score].append(grounded_graph.grounded_query_id)
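
To illustrate why switching those two lines can change the selected query, here is a minimal, self-contained sketch. It is not the repository's code: the `GroundedGraph` dataclass, the `top_query_ids` function, and the example values are hypothetical. Only the attribute names `score`, `total_score`, and `grounded_query_id` and the dictionary name `score_to_queryid_sparql` come from the lines above. The sketch assumes that `total_score` additionally reflects the sentence-level scorer, which is how the ablation remark above reads.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class GroundedGraph:
    """Hypothetical stand-in for a candidate query graph (not the repository's class)."""
    grounded_query_id: int
    score: float        # word-level matcher score
    total_score: float  # assumed to also include the sentence-level scorer


def top_query_ids(graphs, use_total_score=True):
    """Group candidate ids by the chosen score and return the ids with the highest score."""
    score_to_queryid_sparql = defaultdict(list)
    for grounded_graph in graphs:
        key = grounded_graph.total_score if use_total_score else grounded_graph.score
        score_to_queryid_sparql[key].append(grounded_graph.grounded_query_id)
    if not score_to_queryid_sparql:
        return []
    return score_to_queryid_sparql[max(score_to_queryid_sparql)]


# Made-up candidates: the two scores disagree about which one is best.
candidates = [
    GroundedGraph(grounded_query_id=1, score=0.9, total_score=1.2),
    GroundedGraph(grounded_query_id=2, score=0.7, total_score=1.6),
]

print(top_query_ids(candidates, use_total_score=False))  # [1]
print(top_query_ids(candidates, use_total_score=True))   # [2]
```

If the evaluation ranks candidates by the word-level score alone, a different query can be picked for some questions, which would be consistent with the lower summed F1 reported earlier in this thread.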

thanks

g1kyne commented Dec 27, 2021

This has been resolved. Thank you very much!
