page.getCategories() method returns hidden categories #77

Open
daxenberger opened this issue Jul 31, 2015 · 2 comments
Comments

@daxenberger
Member

Originally reported on Google Code with ID 83

Bug was originally reported here: http://groups.google.com/group/jwpl/t/97a005ede47ee60f

---

I noticed that the page.getCategories() method returns a lot of
categories that are not visible on the page.

For example the categories of the article "Germany" include:

Wikipedia indefinitely move-protected pages
All articles containing potentially dated statements
Articles containing German language text
Articles containing potentially dated statements from 2008
Wikipedia semi-protected pages
Articles containing potentially dated statements from November 2009
Articles with dead external links from September 2010
All articles with dead external links
Featured articles
Articles with dead external links from June 2010

All these categories seem rather useless to me. I think it would be
nice if there were a method in the Page class that returned only
visible categories instead of all categories.

I also noticed that all these categories share the parent category
"Hidden categories".
That makes it easy to filter them from the result set. I added this
method to the Page class in my local version of the JWPL API:

public Set<Category> getVisibleCategories()
{
        // Load the IDs of all categories assigned to this page.
        Session session = this.wiki.__getHibernateSession();
        session.beginTransaction();
        session.lock(hibernatePage, LockMode.NONE);
        Set<Integer> tmp = new UnmodifiableArraySet<Integer>(hibernatePage.getCategories());
        session.getTransaction().commit();

        // Resolve each category ID to a Category object.
        Set<Category> allCategories = new HashSet<Category>();
        for (int pageID : tmp) {
                allCategories.add(wiki.getCategory(pageID));
        }

        // Keep only categories that are not direct children of "Hidden categories".
        // 15961454 is the hard-coded page ID of "Hidden categories" in my dump.
        Set<Category> result = new HashSet<Category>();
        for (Category category : allCategories)
        {
                Set<Integer> parentIds = category.getParentIDs();
                if (!parentIds.contains(15961454))
                {
                        result.add(category);
                }
        }
        return result;
}

This solution is bad because I hard-coded the page ID of "Hidden
categories" in the method, but maybe you could include a similar,
better method in the next release of JWPL?
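
One way to avoid the hard-coded ID might be to pass the ID of "Hidden categories" in as a parameter, for example after resolving it once by title for the dump in use. The following is only a minimal, self-contained sketch of that filtering step: the class name VisibleCategories, the method filterVisible, and the map from category ID to parent IDs are hypothetical stand-ins for the JWPL objects and are not part of the JWPL API.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Sketch only: plain maps stand in for JWPL Category objects (an assumption).
final class VisibleCategories {

    /**
     * Returns the IDs of all categories whose parents do not include
     * hiddenParentId. The map goes from a category ID to the IDs of its
     * direct parent categories.
     */
    static Set<Integer> filterVisible(Map<Integer, Set<Integer>> parentIdsByCategory,
                                      int hiddenParentId) {
        Set<Integer> visible = new LinkedHashSet<>();
        for (Map.Entry<Integer, Set<Integer>> entry : parentIdsByCategory.entrySet()) {
            if (!entry.getValue().contains(hiddenParentId)) {
                visible.add(entry.getKey());
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        // ID quoted in the report above; in practice it would be looked up, not fixed.
        int hiddenId = 15961454;

        // Made-up category IDs 1..3 with made-up parents, for illustration only.
        Map<Integer, Set<Integer>> parents = new LinkedHashMap<>();
        parents.put(1, Set.of(hiddenId, 7)); // child of "Hidden categories"
        parents.put(2, Set.of(7));
        parents.put(3, Set.of());

        System.out.println(filterVisible(parents, hiddenId)); // prints [2, 3]
    }
}
```

Because the hidden parent's ID is an argument, the same filter would work across dumps and languages where that ID differs. Like the method above, it only checks direct parents.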



Reported by oliver.ferschke on 2012-03-01 01:26:01

@daxenberger
Member Author

Valid argument. We should do something about that.

Reported by oliver.ferschke on 2012-03-01 01:26:32

@daxenberger
Member Author

Issue 122 has been merged into this issue.

Reported by torsten.zesch on 2013-12-09 09:47:26
