IndexError: index 1 is out of bounds for axis 0 with size 1 #9

Open
Rogerspy opened this issue May 26, 2021 · 5 comments

Comments

@Rogerspy

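Hi, what format should the input file be in? I hit the error below when running. My guess is that something is wrong with the input features, so an output that should be two-dimensional ends up one-dimensional. My input file has one text per line, with no spaces, for example: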

我是一个人。
哈哈哈哈哈。
那里有个苹果。
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-3-097136691c2f> in <module>
      6         LoggingCallback(),
      7         ConstantThresholdScheduler(),
----> 8         EarlyStopping(patience=2, min_delta=3)
      9     ])

/usr/local/lib/python3.6/dist-packages/autophrasex/autophrase.py in mine(self, corpus_files, quality_phrase_files, N, epochs, callbacks, topk, filter_fn, **kwargs)
    122 
    123             callback.on_epoch_reorganize_phrase_pools_begin(epoch, pos_pool, neg_pool)
--> 124             pos_pool, neg_pool = self._reorganize_phrase_pools(pos_pool, neg_pool, **kwargs)
    125             callback.on_epoch_reorganize_phrase_pools_end(epoch, pos_pool, neg_pool)
    126 

/usr/local/lib/python3.6/dist-packages/autophrasex/autophrase.py in _reorganize_phrase_pools(self, pos_pool, neg_pool, **kwargs)
    157         new_pos_pool.extend(deepcopy(pos_pool))
    158 
--> 159         pairs = self._predict_proba(neg_pool)
    160         pairs = sorted(pairs, key=lambda x: x[1], reverse=True)
    161         # print(pairs[:10])

/usr/local/lib/python3.6/dist-packages/autophrasex/autophrase.py in _predict_proba(self, phrases)
    184     def _predict_proba(self, phrases):
    185         features = [self._compose_feature(phrase) for phrase in phrases]
--> 186         pos_probs = [prob[1] for prob in self.classifier.predict_proba(features)]
    187         pairs = [(phrase, prob) for phrase, prob in zip(phrases, pos_probs)]
    188         return pairs

/usr/local/lib/python3.6/dist-packages/autophrasex/autophrase.py in <listcomp>(.0)
    184     def _predict_proba(self, phrases):
    185         features = [self._compose_feature(phrase) for phrase in phrases]
--> 186         pos_probs = [prob[1] for prob in self.classifier.predict_proba(features)]
    187         pairs = [(phrase, prob) for phrase, prob in zip(phrases, pos_probs)]
    188         return pairs

IndexError: index 1 is out of bounds for axis 0 with size 1
@transformerzhou

Hi, I ran into this problem too. Did you manage to solve it?

@AQA6666

AQA6666 commented Jul 28, 2021

I ran into this problem too. I'm now checking whether something is wrong in the reader part.

@transformerzhou

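It looks like nothing in the input corpus matched any quality phrase, so the random forest was trained with only one class. As a result, each `prob` in `pos_probs = [prob[1] for prob in self.classifier.predict_proba(features)]` has length 1, and `prob[1]` goes out of bounds.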
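
To make the failure concrete: assuming the classifier is a scikit-learn estimator, `predict_proba` returns one column per entry in `classifier.classes_`, so a model fitted on a single class produces rows of length 1. The sketch below uses plain lists as a stand-in for that output and looks the positive column up by label instead of by a hard-coded index. The helper name `positive_class_probs` and the 0/1 label convention are my assumptions for illustration, not something taken from autophrasex.

```python
# Stand-in for predict_proba output from a classifier fitted on ONE class:
# one column per class, so every row has a single entry.
single_class_rows = [[1.0], [1.0], [1.0]]
single_class_labels = [0]  # assumption: 0 = negative, 1 = positive


def positive_class_probs(proba_rows, classes, positive_label=1):
    """Return the positive-class probability for each row.

    `classes` plays the role of `classifier.classes_`; the columns of
    `proba_rows` are assumed to follow the same order.
    """
    classes = list(classes)
    if positive_label not in classes:
        # The classifier never saw a positive example, so there is no
        # positive column to read. Treat every phrase as probability 0.0.
        return [0.0 for _ in proba_rows]
    column = classes.index(positive_label)
    return [row[column] for row in proba_rows]


# The original indexing pattern fails on single-class output:
try:
    [row[1] for row in single_class_rows]
except IndexError as exc:
    print("hard-coded index fails:", exc)

# The label-based lookup does not:
print(positive_class_probs(single_class_rows, single_class_labels))
```

Returning 0.0 when the positive class is missing is only one possible policy. It keeps the loop running, but it also hides the fact that no quality phrase matched the corpus, so logging a warning at that point may be worthwhile.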

@luozhouyang
Owner

@transformerzhou Right. The code currently has no check for this case, so this error can be triggered. If you're interested, feel free to submit a PR. Thanks!
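
For anyone picking this up, one possible shape for the missing check is to fail early, before the classifier is fitted, with a message that points at the real cause. This is only a sketch under assumptions: the function name `ensure_both_classes` and the 0/1 labels are hypothetical, and where such a check would belong inside autophrasex is not something this thread settles.

```python
def ensure_both_classes(labels, positive_label=1, negative_label=0):
    """Raise a descriptive error if the training labels lack a class.

    `labels` is the list of training labels that would be passed to the
    classifier's fit step (hypothetical 0/1 convention).
    """
    seen = set(labels)
    missing = [
        name
        for name, label in (("positive", positive_label), ("negative", negative_label))
        if label not in seen
    ]
    if missing:
        raise ValueError(
            "Training data has no "
            + " or ".join(missing)
            + " examples. Check that the corpus actually contains phrases "
            "from the quality phrase file."
        )


# Usage: a label list with both classes passes silently.
ensure_both_classes([0, 1, 1, 0])

# A label list with only negatives is rejected with a clear message.
try:
    ensure_both_classes([0, 0, 0])
except ValueError as exc:
    print(exc)
```

Failing early like this trades a confusing `IndexError` deep in the prediction step for an error that tells the user which input to look at.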

@Lishumuzixin

Has this problem been solved? If so, how did you solve it?
