crispy-pancake

Source code and paper for the "An Exploration of Text Generation" project

Search engine optimization (SEO) is a subject littered with contradictory methodologies, owing to the inherent and necessary secrecy of Google's indexing algorithm: methodologies can only be proven by results, and are subject to change at Google's discretion. One current and popular SEO methodology that has survived these constraints is generating website content to improve search-result ranking. Among other factors, growing a website's body of content toward the scale of goliath sites like Reddit, Stack Overflow, or Facebook helps it grow its SEO value. Purposeless content that offers no value to readers is contrary to what Google wants to offer its users, and to prevent it Google created its Penguin algorithm to prohibit content spamming. One example of the purposeless content the Penguin algorithm seeks to prohibit is content generated with artificial-intelligence techniques for the sole purpose of targeting keywords.

This research attempts to generate fictional content that can pass a Turing test (or the Penguin algorithm), given current natural language processing (NLP) techniques, a novice yet broad skill set, and time constraints. The specific content-generation task is to create fictional, Choose-Your-Own-Adventure-style stories. The input to a model will be a collection of books by H. G. Wells from Project Gutenberg (n.d.). The output of a model will be XML-formatted data depicting the decision branching of the story (Figure 26). An additional goal of this project is to create various avenues to port these stories to, such as a WinForms app or a static HTML webpage. Three levels of language abstraction for NLP purposes will be evaluated by this research: character-level, word-level, and sentence-level.
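To make the decision-branching output concrete, below is a minimal sketch of building such an XML tree with Python's standard-library `xml.etree.ElementTree`. The element and attribute names (`story`, `node`, `choice`, `target`) are hypothetical placeholders; the project's actual schema is the one shown in Figure 26 of the paper.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names -- the real schema is defined in Figure 26,
# not reproduced here. Each <node> holds story text plus the choices
# that branch to other nodes.
story = ET.Element("story", title="Example Adventure")
node = ET.SubElement(story, "node", id="1")
ET.SubElement(node, "text").text = "You stand before the time machine."
choices = ET.SubElement(node, "choices")
ET.SubElement(choices, "choice", target="2").text = "Climb aboard."
ET.SubElement(choices, "choice", target="3").text = "Walk away."

xml_string = ET.tostring(story, encoding="unicode")
```

A consumer such as the WinForms app or static HTML page would parse this tree and follow `target` attributes to render each branch.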
All three levels are vector representations of their given element, produced through various means such as one-hot encoding or word2vec encoding, but all reduce to the same basic problem: getting a model to fit the task at hand, whether that task is translating French to English or, as in this research, creating Choose-Your-Own-Adventure stories. In the case of building Choose-Your-Own-Adventure-style stories, given a vector representation the model should produce the next structure needed to build a coherent sentence: given a character, the model should give the next character; given a word, the next word; given a sentence, the next sentence. These levels of abstraction will be further evaluated and discussed in the context of the given task.
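As a concrete illustration of the character-level case, the following sketch (plain Python, using a toy string in place of the actual H. G. Wells corpus) one-hot encodes characters and forms the (character, next-character) pairs a model would be trained on:

```python
# Toy corpus -- an assumption standing in for the Project Gutenberg books.
corpus = "the time machine"

# Build a character vocabulary and assign each character an index.
vocab = sorted(set(corpus))
char_to_idx = {c: i for i, c in enumerate(vocab)}

def one_hot(ch):
    """Return the one-hot vector representation of a single character."""
    vec = [0] * len(vocab)
    vec[char_to_idx[ch]] = 1
    return vec

# Each training example pairs a character's vector with the character
# that follows it -- the "given a character, give the next" task.
pairs = [(one_hot(corpus[i]), corpus[i + 1]) for i in range(len(corpus) - 1)]
```

The word-level and sentence-level cases follow the same pattern, with the vocabulary built over words or sentences instead of characters.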
The specific models that will be built are the Transformer, the bidirectional LSTM, and the LSTM, while Skip-Thought Vectors and generative adversarial networks will also be explored.
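The neural models themselves are too large to sketch here, but the next-word prediction task they all share can be illustrated with a crude bigram-frequency baseline (not one of the models above; the one-line corpus is a toy assumption):

```python
from collections import Counter, defaultdict

# Toy corpus -- an assumption; the real input is the H. G. Wells collection.
text = "no one would have believed that no one watched"
tokens = text.split()

# For each word, tally which words follow it in the corpus.
following = defaultdict(Counter)
for current, nxt in zip(tokens, tokens[1:]):
    following[current][nxt] += 1

def predict_next(word):
    """Return the most frequent successor of `word` -- a frequency-table
    stand-in for 'given a word, the model should give the next word'."""
    if word not in following:
        return None
    return following[word].most_common(1)[0][0]
```

An LSTM or Transformer replaces the frequency table with a learned function over vector representations, allowing it to generalize beyond bigrams seen verbatim in the corpus.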
