csvkit v1.0.0 Release Notes
Release Date: 2016-12-27
This is the first major release of csvkit in a very long time. The entire backend has been rewritten to leverage the `agate <http://agate.rtfd.io>`_ data analysis library, which was itself inspired by csvkit. The new backend provides better type detection accuracy, as well as some new features.

Because of the long and complex cycle behind this release, the list of changes should not be considered exhaustive. In particular, the output format of some tools may have changed in small ways. Any existing data pipelines using csvkit should be tested as part of the upgrade.

Much of the credit for this release goes to `James McKinney <https://github.com/jpmckinney>`_, who has almost single-handedly kept the csvkit fire burning for a year. Thanks, James!

Backwards-incompatible changes:

- :doc:`/scripts/csvjoin` now renames duplicate columns with integer suffixes to prevent collisions in output.
- :doc:`/scripts/csvsql` now generates ``DateTime`` columns instead of ``Time`` columns.
- :doc:`/scripts/csvsql` now generates ``Decimal`` columns instead of ``Integer``, ``BigInteger``, and ``Float`` columns.
- :doc:`/scripts/csvsql` no longer generates max-length constraints for text columns.
- The ``--doublequote`` long flag is gone, and the ``-b`` short flag is now an alias for ``--no-doublequote``.
- When using the ``--columns`` or ``--not-columns`` options, you must not have spaces around the comma-separated values, unless the column names contain spaces.
- When sorting, null values are now greater than other values instead of less than.
- ``CSVKitReader``, ``CSVKitWriter``, ``CSVKitDictReader``, and ``CSVKitDictWriter`` have been removed. Use ``agate.csv.reader``, ``agate.csv.writer``, ``agate.csv.DictReader`` and ``agate.csv.DictWriter``.
- Drop Python 2.6 support (end-of-life was October 29, 2013).
- Drop support for older versions of PyPy.
- If ``--no-header-row`` is set, the output will have column names ``a``, ``b``, ``c``, etc. instead of ``column1``, ``column2``, ``column3``, etc.
- csvlook renders a simpler, markdown-compatible table.
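Two of the changes above are easy to miss when checking an existing pipeline: nulls now sort after other values, and headerless input gets letter names. The standard-library sketch below only illustrates the expected shape of the results; it is not csvkit's or agate's code. The helper names ``sort_nulls_last`` and ``default_column_names`` are hypothetical, and the second helper deliberately stops at 26 columns because these notes do not say how names continue past ``z``.

```python
import string


def sort_nulls_last(values):
    """Sort values ascending, placing None after every non-null value."""
    # The key's first element is False for real values and True for None,
    # so None always sorts last; the second element orders the real values.
    return sorted(values, key=lambda v: (v is None, 0 if v is None else v))


def default_column_names(count):
    """Return letter names (a, b, c, ...) for up to 26 headerless columns."""
    if count > len(string.ascii_lowercase):
        # The notes only say "a, b, c, etc."; naming past z is not covered here.
        raise ValueError("this sketch only names the first 26 columns")
    return list(string.ascii_lowercase[:count])


sorted_values = sort_nulls_last([3, None, 1, None, 2])
names = default_column_names(3)
```

A pipeline test that previously expected nulls first, or headers such as ``column1``, would need its expectations updated along these lines.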
Improvements:

- csvkit is now tested against Python 3.6. (#702)
- ``import csvkit as csv`` will now defer to agate readers/writers.
- :doc:`/scripts/csvgrep` supports ``--no-header-row``.
- :doc:`/scripts/csvjoin` supports ``--no-header-row``.
- :doc:`/scripts/csvjson` streams input and output if the ``--stream`` and ``--no-inference`` flags are set.
- :doc:`/scripts/csvjson` supports ``--snifflimit`` and ``--no-inference``.
- :doc:`/scripts/csvlook` adds ``--max-rows``, ``--max-columns`` and ``--max-column-width`` options.
- :doc:`/scripts/csvlook` supports ``--snifflimit`` and ``--no-inference``.
- :doc:`/scripts/csvpy` supports ``--agate`` to read a CSV file into an agate table.
- ``csvsql`` supports custom `SQLAlchemy dialects <http://docs.sqlalchemy.org/en/latest/dialects/>`_.
- :doc:`/scripts/csvstat` supports ``--names``.
- :doc:`/scripts/in2csv` CSV-to-CSV conversion streams input and output if the ``--no-inference`` flag is set.
- :doc:`/scripts/in2csv` CSV-to-CSV conversion uses ``agate.Table``.
- :doc:`/scripts/in2csv` GeoJSON conversion adds columns for geometry type, longitude and latitude.
- Documentation: Update tool usage, remove shell prompts, document connection string, correct typos.
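Several items above tie streaming to ``--no-inference``. A plausible reading is that type inference has to see a whole column before it can pick a type, whereas rows can be passed through one at a time when every value is left as text. The sketch below shows that idea with the standard library only. It is an illustration under that assumption, not csvkit's implementation, and ``stream_csv_to_ndjson`` is a hypothetical name.

```python
import csv
import io
import json


def stream_csv_to_ndjson(src, dst):
    """Copy CSV rows from src to dst as newline-delimited JSON, one row at a time."""
    reader = csv.DictReader(src)
    count = 0
    for row in reader:
        # No type inference: every value stays the string read from the file,
        # so each row can be written before the next one is read.
        dst.write(json.dumps(row) + "\n")
        count += 1
    return count


source = io.StringIO("id,name\n1,Ada\n2,Grace\n")
target = io.StringIO()
rows_written = stream_csv_to_ndjson(source, target)
```

Note that ``id`` comes out as the string ``"1"`` rather than a number, which is the trade-off of skipping inference.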
Fixes:

- Fixed numerous instances of open files not being closed before utilities exit.
- Change ``-b``, ``--doublequote`` to ``--no-doublequote``, as doublequote is True by default.
- :doc:`/scripts/in2csv` DBF conversion works with Python 3.
- :doc:`/scripts/in2csv` correctly guesses format when file has an uppercase extension.
- :doc:`/scripts/in2csv` correctly interprets ``--no-inference``.
- :doc:`/scripts/in2csv` again supports nested JSON objects (fixes regression).
- :doc:`/scripts/in2csv` with ``--format geojson`` will print a JSON object instead of ``OrderedDict([(...)])``.
- :doc:`/scripts/csvclean` with standard input works on Windows.
- :doc:`/scripts/csvgrep` returns the input file's line numbers if the ``--linenumbers`` flag is set.
- :doc:`/scripts/csvgrep` can match multiline values.
- :doc:`/scripts/csvgrep` correctly operates on ragged rows.
- :doc:`/scripts/csvsql` correctly escapes ``%`` characters in SQL queries.
- :doc:`/scripts/csvsql` adds standard input only if explicitly requested.
- :doc:`/scripts/csvstack` supports stacking a single file.
- :doc:`/scripts/csvstat` always reports frequencies.
- The ``any_match`` argument of ``FilteringCSVReader`` now works correctly.
- All tools handle empty files without error.
- :doc: