import: data frames are here!
Implemented a whole new class to represent the data that comes in from CSV and XLSX. See docs for more info.

Closes #153
andymeneely committed Dec 2, 2016
1 parent 935040b commit f4d9424
Showing 9 changed files with 484 additions and 30 deletions.
3 changes: 2 additions & 1 deletion CHANGELOG.md
@@ -6,7 +6,8 @@ Squib follows [semantic versioning](http://semver.org).
 Features:
 * `save_pdf` now supports crop marks! These are lines drawn in the margins of a PDF file to help you cut. These can be enabled by setting `crop_marks: true` in your `save_pdf` call. Can be further customized with `crop_margin_bottom`, `crop_margin_left`, `crop_margin_right`, `crop_margin_top`, `crop_marks`, `crop_stroke_color`, `crop_stroke_dash`, and `crop_stroke_width` (#123)
 * `Squib.configure` allows you to set options programmatically, overriding your config.yml. This is useful for Rakefiles, and will be documented in my upcoming tutorial on workflows.
-* `Squib.enable_build_globally` and `Squib.disable_build_globally` are new convenience methods for working with the `SQUIB_BUILD` environment variable. Handy for Rakefiles and Guard sessions for turning certain builds on and off. Also will be in upcoming workflow tutorial.
+* `Squib.enable_build_globally` and `Squib.disable_build_globally` are new convenience methods for working with the `SQUIB_BUILD` environment variable. Handy for Rakefiles and Guard sessions for turning certain builds on and off. Also will be documented in upcoming workflow tutorial.
+* The import methods `csv` and `xlsx` now return `Squib::DataFrame`, which behaves exactly as before - but has more cool features like being able to do `data.name` instead of `data['name']`. Also: check out `data.to_pretty_text`. Check out the docs. (#156)

 Bugs:
 * `showcase` works as expected when using `backend: svg` (#179)
4 changes: 3 additions & 1 deletion docs/build_groups.rst
@@ -34,7 +34,9 @@ One adaptation of this is to do the environment setting in a ``Rakefile``. `Rake
   :language: ruby
   :linenos:

-Thus, you can just run this code on the command line like this::
+Thus, you can just run this code on the command line like this:
+
+.. code-block:: none

   $ rake
   $ rake pnp
12 changes: 8 additions & 4 deletions docs/data.rst
@@ -3,18 +3,22 @@ Be Data-Driven with XLSX and CSV

 Squib supports importing data from ExcelX (.xlsx) files and Comma-Separated Values (.csv) files. Because :doc:`/arrays`, these methods are column-based, which means that they assume you have a header row in your table, and that header row will define the name of the column.

-Hash of Arrays
---------------
+Squib::DataFrame, or a Hash of Arrays
+-------------------------------------

-In both DSL methods, Squib will return a ``Hash`` of ``Arrays`` corresponding to each row. Thus, be sure to structure your data like this:
+In both DSL methods, Squib will return a "data frame" (literally of type ``Squib::DataFrame``). The best way to think of this is a ``Hash`` of ``Arrays``, where each column is a key in the hash, and every element of each Array represents a data point on a card.
+
+The data import methods expect you to structure your Excel sheet or CSV like this:

 * First row should be a header - preferably with concise naming since you'll reference it in Ruby code
 * Rows should represent cards in the deck
 * Columns represent data about cards (e.g. "Type", "Cost", or "Name")

 Of course, you can always import your game data other ways using just Ruby (e.g. from a REST API, a JSON file, or your own custom format). There's nothing special about Squib's methods in how they relate to ``Squib::Deck`` other than their convenience.

-See :doc:`/dsl/xlsx` and :doc:`/dsl/csv` for more details and examples.
+See :doc:`/dsl/xlsx` and :doc:`/dsl/csv` for more details and examples on how the data can be imported.
+
+The ``Squib::DataFrame`` class provides much more than what a ``Hash`` provides, however. See :doc:`/dsl/data_frame` for details.

 Quantity Explosion
 ------------------
85 changes: 85 additions & 0 deletions docs/dsl/data_frame.rst
@@ -0,0 +1,85 @@
Squib::DataFrame
================

As described in :doc:`/data`, the ``Squib::DataFrame`` is what is returned by Squib's data import methods (:doc:`/dsl/csv` and :doc:`/dsl/xlsx`).

It behaves like a ``Hash`` of ``Arrays``, so accessing an individual column can be done via the square brackets, e.g. ``data['title']``.
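
To make the Hash-of-Arrays shape concrete, here is a minimal sketch that uses a plain ``Hash`` as a stand-in for the data frame. The ``DECK`` data and the ``cost_labels`` helper are made up for illustration and are not part of Squib.

```ruby
# Illustrative data only: a plain Hash of Arrays standing in for a data frame.
DECK = {
  'title' => %w[Mage Rogue Warrior],
  'cost'  => [1, 2, 3]
}.freeze

# Hypothetical helper: derive one label per card from the 'cost' column.
def cost_labels(data)
  data['cost'].map { |cost| "Cost: #{cost}" }
end

# Square brackets return a whole column, one element per card.
puts DECK['title'].inspect
puts cost_labels(DECK).inspect
```

Because every column has one element per card, a derived array such as the one returned by ``cost_labels`` lines up with the deck in the same way the imported columns do.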

Here are some other convenience methods in ``Squib::DataFrame``:

columns become methods
----------------------

Through the magic of Ruby metaprogramming, every column also becomes a method on the data frame. So these two are equivalent:

.. code-block:: irb

  irb(main):002:0> data = Squib.csv file: 'basic.csv'
  => #<Squib::DataFrame:0x00000003764550 @hash={"h1"=>[1, 3], "h2"=>[2, 4]}>
  irb(main):003:0> data.h1
  => [1, 3]
  irb(main):004:0> data['h1']
  => [1, 3]

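
The column-to-method behavior can be sketched with ``define_singleton_method``, which defines a method on one object only. The ``TinyFrame`` class below is a hypothetical, stripped-down stand-in written for this example; it is not Squib's class and skips details such as name normalization.

```ruby
# Minimal stand-in for the column-to-method idea; not Squib's actual class.
class TinyFrame
  def initialize(hash)
    @hash = hash
    @hash.each_key { |col| define_column_reader(col) }
  end

  def [](col)
    @hash[col]
  end

  private

  # Define a reader on this one object only, named after the column.
  def define_column_reader(col)
    define_singleton_method(col.to_sym) { @hash[col] }
  end
end

data = TinyFrame.new('h1' => [1, 3], 'h2' => [2, 4])
puts data.h1.inspect     # same array as data['h1']
puts data['h1'].inspect
```

Defining the readers per object rather than per class means two frames with different columns do not leak methods into each other.
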
#columns
--------

Returns an array of the column names in the data frame.

#ncolumns
---------

Returns the number of columns in the data frame.

#col?(name)
-----------

Returns ``true`` if there is a column named ``name``.

#row(i)
-------

Returns a hash of values across all columns in row ``i`` of the data frame. Represents a single card.

#nrows
------

Returns the number of rows the data frame has, computed by the maximum length of any column array.
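
As a rough illustration of the two descriptions above, the sketch below reimplements the same ideas as standalone functions over a plain ``Hash`` of ``Arrays``. The sample data is invented; the point is how a short column behaves.

```ruby
# Standalone sketch over a plain Hash of Arrays (assumed data, not Squib code).

# Number of rows = length of the longest column (0 when there are no columns).
def nrows(columns)
  columns.values.map(&:size).max || 0
end

# Row i = element i of every column, keyed by column name.
def row(columns, i)
  columns.transform_values { |values| values[i] }
end

ragged = { 'name' => %w[Mage Rogue Warrior], 'cost' => [1, 2] }
puts nrows(ragged)            # prints 3
puts row(ragged, 2).inspect   # 'cost' has no third element, so it is nil here
```

Since the row count follows the longest column, a shorter column simply contributes ``nil`` for the missing cards.
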

#to_json
--------

Returns a ``json`` representation of the entire data frame.

#to_pretty_json
---------------

Returns a ``json`` representation of the entire data frame, formatted with indentation for human viewing.
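
The difference between the compact and the indented form can be seen with Ruby's standard ``json`` library alone. This sketch works on a plain ``Hash`` with invented data, which is what the source shows both methods delegating to.

```ruby
require 'json'

# Illustrative column data; a plain Hash stands in for the data frame's contents.
COLUMNS = { 'name' => %w[Mage Rogue], 'cost' => [1, 2] }.freeze

# Compact, single-line JSON.
def compact_json(columns)
  columns.to_json
end

# Same data, indented across multiple lines for reading.
def pretty_json(columns)
  JSON.pretty_generate(columns)
end

puts compact_json(COLUMNS)   # prints {"name":["Mage","Rogue"],"cost":[1,2]}
puts pretty_json(COLUMNS)
```

Both strings parse back to the same structure; only the whitespace differs.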

#to_pretty_text
---------------

Returns a textual representation of the data frame that emulates what the information looks like on an individual card. Here's an example:

.. code-block:: text

              ╭------------------------------------╮
         Name | Mage                               |
         Cost | 1                                  |
  Description | You may cast 1 spell per turn      |
        Snark | Magic, dude.                       |
              ╰------------------------------------╯
              ╭------------------------------------╮
         Name | Rogue                              |
         Cost | 2                                  |
  Description | You always take the first turn.    |
        Snark | I like to be sneaky                |
              ╰------------------------------------╯
              ╭------------------------------------╮
         Name | Warrior                            |
         Cost | 3                                  |
  Description |                                    |
        Snark | I have a long story to tell to tes |
              | t the word-wrapping ability of pre |
              | tty text formatting.               |
              ╰------------------------------------╯
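
The fixed-width wrapping visible in the last card can be sketched on its own. The ``wrap_fixed`` helper below is a hypothetical name for this example; it follows the same scan-and-pad idea as the source's ``wrap_n_pad`` with a 34-character width, but it builds a new string with ``+`` so the caller's string is left untouched.

```ruby
# Split text into fixed-width chunks and pad each chunk to the same width.
# The appended space guarantees at least one chunk for nil or empty input.
def wrap_fixed(text, width = 34)
  (text.to_s + ' ').scan(/.{1,#{width}}/).map { |chunk| chunk.ljust(width) }
end

story = 'I have a long story to tell to test the word-wrapping ability of pretty text formatting.'
wrap_fixed(story).each do |line|
  puts "| #{line} |"
end
```

Note that this wraps at a character count, not at word boundaries, which is why "test" and "pretty" are split across lines in the example output.
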
4 changes: 3 additions & 1 deletion docs/install.rst
@@ -13,7 +13,9 @@ Squib works with both x86 and x86_64 versions of Ruby.
 Typical Install
 ---------------

-Regardless of your OS, installation is::
+Regardless of your OS, installation is:
+
+.. code-block:: none

   $ gem install squib
11 changes: 6 additions & 5 deletions lib/squib/api/data.rb
@@ -3,6 +3,7 @@
 require_relative '../args/input_file'
 require_relative '../args/import'
 require_relative '../args/csv_opts'
+require_relative '../import/data_frame'

 module Squib

@@ -12,7 +13,7 @@ def xlsx(opts = {})
     import = Args::Import.new.load!(opts)
     s = Roo::Excelx.new(input.file[0])
     s.default_sheet = s.sheets[input.sheet[0]]
-    data = {}
+    data = Squib::DataFrame.new
     s.first_column.upto(s.last_column) do |col|
       header = s.cell(s.first_row, col).to_s
       header.strip! if import.strip?
Expand All @@ -39,14 +40,14 @@ def csv(opts = {})
csv_opts = Args::CSV_Opts.new(opts)
table = CSV.parse(data, csv_opts.to_hash)
check_duplicate_csv_headers(table)
hash = Hash.new
hash = Squib::DataFrame.new
table.headers.each do |header|
new_header = header.to_s
new_header = new_header.strip if import.strip?
hash[new_header] ||= table[header]
end
if import.strip?
new_hash = Hash.new
new_hash = Squib::DataFrame.new
hash.each do |header, col|
new_hash[header] = col.map do |str|
str = str.strip if str.respond_to?(:strip)
@@ -78,9 +79,9 @@ def check_duplicate_csv_headers(table)

   # @api private
   def explode_quantities(data, qty)
-    return data unless data.key? qty.to_s.strip
+    return data unless data.col? qty.to_s.strip
     qtys = data[qty]
-    new_data = {}
+    new_data = Squib::DataFrame.new
     data.each do |col, arr|
       new_data[col] = []
       qtys.each_with_index do |qty, index|
108 changes: 108 additions & 0 deletions lib/squib/import/data_frame.rb
@@ -0,0 +1,108 @@
# encoding: UTF-8

require 'json'
require 'forwardable'

module Squib
  class DataFrame
    include Enumerable

    def initialize(hash = {}, def_columns = true)
      @hash = hash
      columns.each { |col| def_column(col) } if def_columns
    end

    def each(&block)
      @hash.each(&block)
    end

    def [](i)
      @hash[i]
    end

    def []=(col, v)
      @hash[col] = v
      def_column(col)
      return v
    end

    def columns
      @hash.keys
    end

    def ncolumns
      @hash.keys.size
    end

    def col?(col)
      @hash.key? col
    end

    def row(i)
      @hash.inject(Hash.new) { |ret, (name, arr)| ret[name] = arr[i]; ret }
    end

    def nrows
      @hash.inject(0) { |max, (_n, col)| col.size > max ? col.size : max }
    end

    def to_json
      @hash.to_json
    end

    def to_pretty_json
      JSON.pretty_generate(@hash)
    end

    def to_h
      @hash
    end

    def to_pretty_text
      max_col = columns.inject(0) { |max, c| c.length > max ? c.length : max }
      top    = " ╭#{'-' * 36}╮\n"
      bottom = " ╰#{'-' * 36}╯\n"
      str = ''
      0.upto(nrows - 1) do |i|
        str += (' ' * max_col) + top
        row(i).each do |col, data|
          str += "#{col.rjust(max_col)} #{wrap_n_pad(data, max_col)}"
        end
        str += (' ' * max_col) + bottom
      end
      return str
    end

    private

    def snake_case(str)
      str.to_s.
        strip.
        gsub(/\s+/,'_').
        gsub(/([A-Z]+)([A-Z][a-z])/,'\1_\2').
        gsub(/([a-z]+)([A-Z])/,'\1_\2').
        downcase.
        to_sym
    end

    def wrap_n_pad(str, max_col)
      (str.to_s + ' '). # handle nil & empty strings; + copies, so the cell is not mutated
        scan(/.{1,34}/).
        map { |s| (' ' * max_col) + " | " + s.ljust(34) }.
        join(" |\n").
        lstrip.                   # initially no whitespace next to key
        concat(" |\n")
    end

    def def_column(col)
      raise "Column #{col} - does not exist" unless @hash.key? col
      method_name = snake_case(col)
      return if self.class.method_defined?(method_name) #warn people? or skip?
      define_singleton_method method_name do
        @hash[col]
      end
    end

  end
end
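
To see which method name a given header ends up with, the ``snake_case`` transformation from the file above can be run on its own. The sketch below copies that chain into a top-level function purely for experimentation; the sample headers are invented.

```ruby
# Same transformation chain as DataFrame#snake_case, lifted out for experimentation.
def snake_case(str)
  str.to_s
     .strip
     .gsub(/\s+/, '_')                        # whitespace runs become underscores
     .gsub(/([A-Z]+)([A-Z][a-z])/, '\1_\2')   # split an acronym from a following word
     .gsub(/([a-z]+)([A-Z])/, '\1_\2')        # split camelCase boundaries
     .downcase
     .to_sym
end

['Name', 'Attack Power', 'maxHP', 'XMLCost'].each do |header|
  puts "#{header.inspect} becomes #{snake_case(header).inspect}"
end
```

Two consequences follow from the code as shown: a header such as ``Attack Power`` is reachable as ``data.attack_power``, and a header whose snake-cased name matches a method the class already has (for example ``Count``, since the class includes ``Enumerable``) gets no new method and stays reachable only through the square brackets.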