Tablib: format-agnostic tabular dataset library
===============================================
::

    _____         ______  ___________ ______
    __  /_______ ____  /_ ___  /___(_)___  /_
    _  __/_  __ `/__  __ \__  / __  / __  __ \
    / /_  / /_/ / _  /_/ /_  /  _  /  _  /_/ /
    \__/  \__,_/  /_.___/ /_/   /_/   /_.___/

Tablib is a format-agnostic tabular dataset library, written in Python.

Output formats supported:

- Excel (Sets + Books)
- JSON (Sets + Books)
- YAML (Sets + Books)
- HTML (Sets)
- TSV (Sets)
- CSV (Sets)

Note that tablib *purposefully* excludes XML support. It always will. (Note: This is a joke. Pull requests are welcome.)

Overview
--------

`tablib.Dataset()`
    A Dataset is a table of data. It may or may not have a header row. Datasets can be built and manipulated as raw Python datatypes (lists of tuples or dictionaries). Datasets can be imported from JSON, YAML, and CSV; they can be exported to XLSX, XLS, ODS, JSON, YAML, CSV, TSV, and HTML.

`tablib.Databook()`
    A Databook is a set of Datasets. The most common form of a Databook is an Excel file with multiple spreadsheets. Databooks can be imported from JSON and YAML; they can be exported to XLSX, XLS, ODS, JSON, and YAML.

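
As a rough sketch of how the two types relate, the snippet below builds two Datasets (one of them without headers), groups them into a Databook, and exports the book as JSON. It assumes a reasonably recent tablib release in which ``export()`` and the ``title`` attribute are available; the sample data and variable names are made up for illustration: ::

    import json

    import tablib

    # A Dataset with a header row.
    presidents = tablib.Dataset(
        ('John', 'Adams'),
        ('George', 'Washington'),
        headers=('first_name', 'last_name'),
    )
    presidents.title = 'Presidents'

    # A Dataset without headers is just a list of rows.
    scores = tablib.Dataset((90, 67), (83, 71))
    scores.title = 'Scores'

    # A Databook is a collection of Datasets (think: sheets in a workbook).
    book = tablib.Databook((presidents, scores))

    # The exported JSON holds one entry per sheet.
    sheets = json.loads(book.export('json'))
    print([sheet['title'] for sheet in sheets])

Giving each Dataset a title is optional, but it makes the sheets easier to tell apart once they are exported together.
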
Usage
-----
Populate a fresh dataset: ::

    import tablib

    headers = ('first_name', 'last_name')

    data = [
        ('John', 'Adams'),
        ('George', 'Washington')
    ]

    data = tablib.Dataset(*data, headers=headers)

Intelligently add new rows: ::

    >>> data.append(('Henry', 'Ford'))

Intelligently add new columns: ::

    >>> data.append_col((90, 67, 83), header='age')

Slice rows: ::

    >>> print(data[:2])
    [('John', 'Adams', 90), ('George', 'Washington', 67)]

Slice columns by header: ::

    >>> print(data['first_name'])
    ['John', 'George', 'Henry']

Easily delete rows: ::

    >>> del data[1]

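
Part of what makes adding rows and columns "intelligent" is that tablib checks their size against the existing table. The following is a minimal sketch of that behaviour, assuming a tablib release that exposes ``tablib.InvalidDimensions`` at the package level; the sample values are illustrative only: ::

    import tablib

    data = tablib.Dataset(
        ('John', 'Adams'),
        ('George', 'Washington'),
        headers=('first_name', 'last_name'),
    )

    # A row must have exactly one value per existing column.
    try:
        data.append(('Henry',))
    except tablib.InvalidDimensions:
        print('row rejected: expected %d values' % data.width)

    # A column must have exactly one value per existing row.
    try:
        data.append_col((90, 67, 83), header='age')
    except tablib.InvalidDimensions:
        print('column rejected: expected %d values' % data.height)

    # Correctly sized additions go through.
    data.append(('Henry', 'Ford'))
    data.append_col((90, 67, 83), header='age')
    print(data.width, data.height)
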
Exports
-------
Drumroll please...........

JSON!
+++++
::

    >>> print(data.json)
    [
      {
        "last_name": "Adams",
        "age": 90,
        "first_name": "John"
      },
      {
        "last_name": "Ford",
        "age": 83,
        "first_name": "Henry"
      }
    ]

YAML!
+++++
::

    >>> print(data.yaml)
    - {age: 90, first_name: John, last_name: Adams}
    - {age: 83, first_name: Henry, last_name: Ford}

CSV...
++++++
::

    >>> print(data.csv)
    first_name,last_name,age
    John,Adams,90
    Henry,Ford,83

EXCEL!
++++++
::

    >>> with open('people.xls', 'wb') as f:
    ...     f.write(data.xls)

It's that easy.
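
Because each text export is just a string, a Dataset can be written out in one format and read back in again. The sketch below assumes a recent tablib release that provides ``export()`` and ``load()``; older releases expose the same data through the ``data.json`` and ``data.csv`` properties used above: ::

    import json

    import tablib

    data = tablib.Dataset(
        ('John', 'Adams', 90),
        ('Henry', 'Ford', 83),
        headers=('first_name', 'last_name', 'age'),
    )

    # Export to CSV text, then load that text into a fresh Dataset.
    csv_text = data.export('csv')
    restored = tablib.Dataset().load(csv_text, format='csv')

    # CSV carries no type information, so numbers come back as strings.
    print(restored.headers)
    print(restored['age'])

    # JSON keeps the integers intact.
    records = json.loads(data.export('json'))
    print(records[0]['age'])

If value types matter after a round trip, JSON or YAML is the safer choice; CSV is best treated as text-only.
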
Installation
------------
To install tablib, simply: ::

    $ pip install tablib

Or, if you absolutely must: ::

    $ easy_install tablib

Contribute
----------
If you'd like to contribute, simply fork `the repository`_, commit your
changes to the **develop** branch (or branch off of it), and send a pull
request. Make sure you add yourself to AUTHORS_.

.. _`the repository`: http://github.com/kennethreitz/tablib
.. _AUTHORS: http://github.com/kennethreitz/tablib/blob/master/AUTHORS