14346 Commits (0.19.x)

Author SHA1 Message Date
Jeremy Gray 12b3264d17 DOC: typo: comverted -> converted (#15977) 2017-04-11 22:22:10 +02:00
James Draper ead784f2b4 Changed pandas-qt to the Python 2/3 friendly qtpandas. (#14818)
Changed the link from the abandoned, Python 2 only pandas-qt repository to the new, functional, Python 2/3 friendly qtpandas.
2016-12-29 15:56:28 +01:00
Joris Van den Bossche 825876ca7e RLS: v0.19.2 2016-12-24 16:53:18 +01:00
Joris Van den Bossche 87535f7fb0 DOC: update release notes for 0.19.2 2016-12-24 15:18:58 +01:00
Jeff Reback cd2d1d30a9 TST: skip gbq upload test as flakey
(cherry picked from commit 45910ae646)
2016-12-24 14:53:29 +01:00
Joris Van den Bossche eabbc9744e DOC: clean-up v0.19.2 whatsnew 2016-12-24 13:18:03 +01:00
Dr-Irv bf9317d449 DOC: update Pandas Cheat Sheet (GH13202)
closes #14963

(cherry picked from commit 9cf1f7eb54)
2016-12-24 12:39:49 +01:00
Dr-Irv 9bf1ace77d DOC: Pandas Cheat Sheet
closes #13202
closes #14943

(cherry picked from commit f79bc7a9d1)
2016-12-24 12:39:39 +01:00
Joris Van den Bossche 419819d6e8 TST: matplotlib 2.0 fix in log limits for barplot (GH14808) (#14957)
(cherry picked from commit f293d6219d)
2016-12-24 11:53:48 +01:00
Joris Van den Bossche 41ab067dfc flake8 fix import 2016-12-24 11:49:44 +01:00
Joris Van den Bossche 2c7e79d709 Remove test - from 0.20.0 PR slipped in 2016-12-24 03:50:00 +01:00
Jeff Reback 5110eaf1b7 PERF: fix getitem unique_check / initialization issue
closes #14930

Author: Jeff Reback <jeff@reback.net>

Closes #14933 from jreback/perf and squashes the following commits:

dc32b39 [Jeff Reback] PERF: fix getitem unique_check / initialization issue

(cherry picked from commit 07c83eedba)
2016-12-24 03:29:35 +01:00
Maximilian Roos a8d8fae410 cache and remove boxing (#14931)
(cherry picked from commit 4c3d4d4fbb)
2016-12-24 03:28:03 +01:00
Chris Ham 85bc6d7cd8 CLN: Resubmit of GH14700. Fixes GH14554. Errors other than IndexingError and KeyError now bubble up appropriately.

closes #14554

Author: Chris Ham <chris@christopher-ham.com>

Closes #14912 from clham/gh14554-b and squashes the following commits:

458c0cc [Chris Ham] CLN: Resubmit of GH14700.  Fixes GH14554.  Errors other than IndexingError and KeyError now bubble up appropriately.

(cherry picked from commit 3ccb50131b)
2016-12-24 03:24:38 +01:00
Nate Yoder 21ebc0fa54 Clean up construction of Series with dictionary and datetime index
closes #14894
Fix usage of fast_multiget with index which was always throwing an
exception that was then caught; add ASV that show slight improvement

Author: Nate Yoder <nate@whistle.com>

Closes #14895 from nateyoder/series_dict_index and squashes the following commits:

56be091 [Nate Yoder] Update whatsnew and fix pep8 issue
5f05fdc [Nate Yoder] Fix usage of fast_multiget with index which was always throwing an exception that was then caught; add ASV that show slight improvement

(cherry picked from commit e503d40ace)
2016-12-24 03:22:39 +01:00
Rodolfo Fernandez bbb76869e7 BUG: .fillna() for datetime64 with tz is passing thru floats
closes #14872

Author: Rodolfo Fernandez <opensourceworkAR@users.noreply.github.com>

Closes #14905 from RodolfoRFR/pandas-14872-e and squashes the following commits:

18802b4 [Rodolfo Fernandez] added 'self' to test_dtype_utc function in pandas/tests/series/test_missing
e0c6c7c [Rodolfo Fernandez] added line to whatsnew v0.19.2 and test to test_missing.py in series folder
e4ba7e0 [Rodolfo Fernandez] removed all references to _DATELIKE_DTYPES from /pandas/core/missing.py
5d37ce8 [Rodolfo Fernandez] added is_datetime64tz_dtype and changed evaluation from 'values' to dtype
19eecb2 [Rodolfo Fernandez] fixed style errors using flake8
59b91a1 [Rodolfo Fernandez] test modified
5a59eac [Rodolfo Fernandez] test modified
bc68bf7 [Rodolfo Fernandez] test modified
ba83fc8 [Rodolfo Fernandez] test
b7358de [Rodolfo Fernandez] bug fixed

(cherry picked from commit f3c5a427cc)
2016-12-24 03:22:05 +01:00
gfyoung c9e5bf41f7 BUG: Patch read_csv NA values behaviour
Patches the following behaviour when `na_values` is passed in as a
dictionary:

1. Prevent aliasing in case `na_values` was defined in a broader scope.
2. Respect column indices as keys when doing NA conversions.

Closes #14203.

Author: gfyoung <gfyoung17@gmail.com>

Closes #14751 from gfyoung/csv-na-values-patching and squashes the following commits:

cac422c [gfyoung] BUG: Respect column indices for dict-like na_values
1439c27 [gfyoung] BUG: Prevent aliasing of dict na_values

(cherry picked from commit dd8cba2767)
2016-12-24 03:15:00 +01:00
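
The two points in the commit above can be sketched in plain Python. This is only an illustration, not the pandas implementation: `resolve_na_values` and its treatment of integer keys as column positions are assumptions made for this example.

```python
def resolve_na_values(na_values, columns):
    """Return a new per-column mapping without mutating the caller's dict.

    Hypothetical helper for illustration; not part of pandas.
    """
    resolved = {}
    for key, values in na_values.items():
        # Assumption of this sketch: an integer key is a column position.
        name = columns[key] if isinstance(key, int) else key
        # Build a new set so the caller's list is never aliased.
        resolved[name] = set(values)
    return resolved


# A dict defined in a broader scope and reused across calls.
shared = {0: ["-"], "b": ["missing"]}
resolved = resolve_na_values(shared, ["a", "b"])

# Mutating the resolved mapping leaves the caller's dict untouched.
resolved["a"].add("n/a")
```

Copying into fresh containers is what keeps a later call with the same `shared` dict from seeing values added during an earlier one.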
Christopher C. Aycock c520b25944 ENH: merge_asof() has type specializations and can take multiple 'by' parameters (#13936)
closes #13936

Author: Christopher C. Aycock <christopher.aycock@twosigma.com>

Closes #14783 from chrisaycock/GH13936 and squashes the following commits:

ffcf0c2 [Christopher C. Aycock] Added test to reject float16; fixed typos
1f208a8 [Christopher C. Aycock] Use tuple representation instead of strings
77eb47b [Christopher C. Aycock] Merge master branch into GH13936
89256f0 [Christopher C. Aycock] Test 8-bit integers and raise error on 16-bit floats; add comments
0ad1687 [Christopher C. Aycock] Fixed whatsnew
2bce3cc [Christopher C. Aycock] Revert dict back to PyObjectHashTable in response to code review
fafbb02 [Christopher C. Aycock] Updated benchmarks to reflect new ASV setup
5eeb7d9 [Christopher C. Aycock] Merge master into GH13936
c33c4cb [Christopher C. Aycock] Merge branch 'master' into GH13936
46cc309 [Christopher C. Aycock] Update documentation
f01142c [Christopher C. Aycock] Merge master branch
75157fc [Christopher C. Aycock] merge_asof() has type specializations and can take multiple 'by' parameters (#13936)

(cherry picked from commit e7df7516ff)
2016-12-24 03:11:34 +01:00
Joris Van den Bossche a5091722a3 [Backport #14886] BUG: regression in DataFrame.combine_first with integer columns (GH14687) (#14886)
(cherry picked from commit 992dfbc6f5)
2016-12-24 03:08:18 +01:00
Keshav Ramaswamy 435947134c Fixed KDE Plot to drop the missing values (#14820)
BUG: Fixed KDE plot to ignore missing values

closes #14821

* fixed kde plot to ignore the missing values
* added comment to elaborate the changes made
* added a release note in whatsnew/0.19.2
* added test to check for missing values and cleaned up whatsnew doc
* added comment to refer the issue
* modified to fit lint checks
* replaced ._xorig with .get_xdata()
(cherry picked from commit 033d34596f)
2016-12-24 03:07:48 +01:00
Christopher C. Aycock 9a6a78f36b ENH: merge_asof() has left_index/right_index and left_by/right_by (#14253) (#14531)
(cherry picked from commit 84cad61556)
2016-12-24 03:05:09 +01:00
Joris Van den Bossche f1d43a4b50 TST: correct url for test file on s3 (xref #14587)
(cherry picked from commit ed2173695d)
2016-12-15 14:06:56 +01:00
Daniel Himmelstein 42bc79cfaa TST: Create compressed salary testing data (#14587)
(cherry picked from commit 85a6464407)
2016-12-15 14:06:52 +01:00
Jeff Reback 03d3f185bb DOC: whatsnew 0.19.2 2016-12-15 06:54:57 -05:00
hesham.shabana@hotmail.com 59b2520282 DOC: fix groupby.rst for building issues
closes #14861
closes #14863

(cherry picked from commit 96b171a659)
2016-12-15 10:55:15 +01:00
Pietro Battiston 26920d1073 BUG: Apply min_itemsize to index even when not appending
closes #10381

Author: Pietro Battiston <me@pietrobattiston.it>

Closes #14812 from toobaz/to_hdf_min_itemsize and squashes the following commits:

c07f1e4 [Pietro Battiston] Whatsnew
38b8fcc [Pietro Battiston] Tests for previous commit
c838afa [Pietro Battiston] BUG: set min_itemsize even when there is no need to validate (#10381)

(cherry picked from commit e833096244)
2016-12-15 10:54:10 +01:00
Christopher C. Aycock 7f53ea8fac BUG: Allow TZ-aware DatetimeIndex in merge_asof() (#14844)
closes #14844

Author: Christopher C. Aycock <christopher.aycock@twosigma.com>

Closes #14845 from chrisaycock/GH14844 and squashes the following commits:

97b73a8 [Christopher C. Aycock] BUG: Allow TZ-aware DatetimeIndex in merge_asof() (#14844)

(cherry picked from commit e991141f3c)
2016-12-15 10:52:50 +01:00
Pawel Kordek 1bc64b1f5c BUG: GH11847 Unstack with mixed dtypes coerces everything to object
closes #11847

Changed the way in which the original data frame is copied (dropped use
of .values, since it does not preserve dtypes).

Author: Pawel Kordek <pawel.kordek@gmail.com>

Closes #14053 from kordek/#11847 and squashes the following commits:

6a381ce [Pawel Kordek] BUG: GH11847 Unstack with mixed dtypes coerces everything to object

(cherry picked from commit d531718749)
2016-12-15 10:52:30 +01:00
Jeff Reback 3276c8aeb2 TST: skip testing on windows for specific formatting which sometimes hangs (#14851)
xref #14626
(cherry picked from commit 34807fc25e)
2016-12-15 10:50:18 +01:00
wandersoncferreira bcd76ed94d DOC: add section on groupby().rolling/expanding/resample (#14801) 2016-12-15 10:49:46 +01:00
Jeff Reback dc23751b44 BUG: fix hash collisions from int overflow (#14805)
* BUG: we don't like hash collisions in siphash

xref #14767

* This should be a 64-bit int, not an 8-bit int

* fix tests

(cherry picked from commit 51f725f7e8)
2016-12-15 10:46:06 +01:00
Jeff Reback 13f28f558b COMPAT: numpy compat with 1-ndim object array compat and broadcasting (#14809)
xref #14808
(cherry picked from commit 0412732222)
2016-12-15 10:45:50 +01:00
Matt Roeschke 11eb8abac0 BUG: _nsorted incorrect with duplicated values in index
closes #13412
closes #14707

(cherry picked from commit 6e514dacc1)
2016-12-15 10:45:10 +01:00
Jeff Carey 96cac411db BUG: Corrects stopping logic when nrows argument is supplied (#7626)
closes #7626

Subsets of tabular files with different "shapes" will now load when a
valid skiprows/nrows is given as an argument.

Conditions for error:

1) There are different "shapes" within a tabular data file, i.e.
different numbers of columns.
2) A "narrower" set of columns is followed by a "wider" (more columns)
one, and the narrower set is laid out such that the end of a
262144-byte block occurs within it.

Issue summary:

The C engine for parsing files reads in 262144 bytes at a time.
Previously, the "start_lines" variable in
tokenizer.c/tokenize_bytes() was set incorrectly to the first line in
that chunk, rather than the overall first row requested. This led to
incorrect logic on when to stop reading when nrows is supplied by the
user. This always happened but only caused a crash when a wider set of
columns followed in the file. In other cases, extra rows were read in
but then harmlessly discarded.

This pull request always uses the first requested row for comparisons,
so only nrows will be parsed when supplied.

Author: Jeff Carey <jeff.carey@gmail.com>

Closes #14747 from jeffcarey/fix/7626 and squashes the following commits:

cac1bac [Jeff Carey] Removed duplicative test
6f1965a [Jeff Carey] BUG: Corrects stopping logic when nrows argument is supplied (Fixes #7626)

(cherry picked from commit 4378f82967)

 Conflicts:
	pandas/io/tests/parser/c_parser_only.py
2016-12-15 10:44:42 +01:00
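
The stopping-logic issue described in the commit above can be modelled with a small pure-Python sketch. It is a toy model under stated assumptions, not the C tokenizer: `tokenize`, `chunk_size`, and the `reset_start_per_chunk` flag are hypothetical names, and lists of rows stand in for 262144-byte blocks.

```python
def tokenize(lines, nrows, chunk_size, reset_start_per_chunk):
    """Toy model of a chunked parser that should stop after `nrows` rows.

    reset_start_per_chunk=True models the old behaviour: the baseline used
    by the inner stop check is the first line of the current chunk.
    False models the fix: the baseline is the first requested row overall.
    """
    parsed = []
    overall_start = 0  # first requested row overall
    pos = 0
    # Outer loop: fetch another chunk until enough rows were parsed overall.
    while pos < len(lines) and len(parsed) - overall_start < nrows:
        chunk = lines[pos:pos + chunk_size]
        pos += len(chunk)
        start_lines = len(parsed) if reset_start_per_chunk else overall_start
        # Inner loop: stop check made while consuming the chunk.
        for line in chunk:
            if len(parsed) - start_lines >= nrows:
                break
            parsed.append(line)
    return parsed


rows = ["row%d" % i for i in range(10)]
old = tokenize(rows, nrows=5, chunk_size=4, reset_start_per_chunk=True)
new = tokenize(rows, nrows=5, chunk_size=4, reset_start_per_chunk=False)
```

In this model the old baseline lets the second chunk be consumed in full, so `old` holds more than the five requested rows, while `new` stops exactly at five. In the real parser those extra rows only became a problem when they were wider than the rows before them.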
Pietro Battiston 90e19223af BUG: Ensure min_itemsize is always a list (#11412)
closes #11412

Author: Pietro Battiston <me@pietrobattiston.it>

Closes #14728 from toobaz/minitemsizefix and squashes the following commits:

e25cd1f [Pietro Battiston] Whatsnew
b9bb88f [Pietro Battiston] Tests for previous commit
6406ee8 [Pietro Battiston] BUG: Ensure min_itemsize is always a list

(cherry picked from commit 53bf1b27c7)
2016-12-15 10:42:53 +01:00
Matt Roeschke 36dad8418f BUG: Bug upon Series.Groupby.nunique with empty Series
closes #12553
closes #14770

(cherry picked from commit c0e13d1bcc)
2016-12-15 10:41:21 +01:00
Jeff Reback 04b83e021b [Backport #14777] BUG: Bug in a groupby of a non-lexsorted MultiIndex
closes #14776

Author: Jeff Reback <jeff@reback.net>

Closes #14777 from jreback/mi_sort and squashes the following commits:

cf31905 [Jeff Reback] BUG: Bug in a groupby of a non-lexsorted MultiIndex and multiple grouping levels

(cherry picked from commit f23010aa93)
2016-12-15 10:39:19 +01:00
Chris 7814a6654e [Backport #14791] BUG: multi-index HDFStore data_columns=True
closes #14435

Author: Chris <cbartak@gmail.com>

Closes #14791 from chris-b1/hdf-mi-datacolumns and squashes the following commits:

5d32610 [Chris] BUG: multi-index HDFStore data_columns=True

(cherry picked from commit 27fcd811f5)
2016-12-15 10:38:57 +01:00
Joris Van den Bossche 95f088fbb7 DOC: specify link to frequencies (#14760)
(cherry picked from commit 2bd9c95ffe)
2016-12-15 10:37:26 +01:00
Joris Van den Bossche d22d155957 PEP8: fix line length
(cherry picked from commit 87beca3d0d)
2016-12-15 10:36:32 +01:00
sinhrks 7479d4185f [Backport #12745] PERF: Improve replace perf
When .replace is called with a `dict`, replacements are done per value.
The current implementation tries to soft convert the dtype in every
replacement, but it is enough to do this in the final replacement.

Author: sinhrks <sinhrks@gmail.com>

Closes #12745 from sinhrks/replace_perf and squashes the following commits:

ffc59b0 [sinhrks] PERF: Improve replace perf

(cherry picked from commit e299560dff)
2016-12-15 10:36:01 +01:00
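
The idea in the commit above (convert once, after the last per-value replacement) can be sketched in plain Python. This is a minimal illustration, not the pandas code: `soft_convert` and `replace_with_dict` are hypothetical stand-ins, and lists stand in for typed arrays.

```python
def soft_convert(values):
    """Hypothetical dtype inference pass: use ints if every value is integral."""
    if all(isinstance(v, (int, float)) and float(v).is_integer() for v in values):
        return [int(v) for v in values]
    return list(values)


def replace_with_dict(values, mapping):
    """Replace one key at a time, converting only after the final key."""
    # Masks come from the original values, so {1: 2, 2: 3} does not chain.
    masks = [(new, [v == old for v in values]) for old, new in mapping.items()]
    result = list(values)
    conversions = 0
    for position, (new, mask) in enumerate(masks):
        result = [new if hit else v for v, hit in zip(result, mask)]
        if position == len(masks) - 1:  # final replacement only
            result = soft_convert(result)
            conversions += 1
    return result, conversions


result, conversions = replace_with_dict([1.0, 2.0, 3.5], {3.5: 4.0, 1.0: 7.0})
```

The conversion pass scans every value, so running it once instead of once per key removes work that grows with the size of the replacement dict.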
Jeff Reback 560aded980 [Backport #14767] ERR: raise on python in object hashing, only supporting strings, nulls
xref #14729

Author: Jeff Reback <jeff@reback.net>

Closes #14767 from jreback/hashing_object and squashes the following commits:

9a5a5d4 [Jeff Reback] ERR: raise on python in object hashing, only supporting strings, nulls

(cherry picked from commit de1132d878)
2016-12-15 10:35:12 +01:00
Jeff Reback 612508a0ce BLD: clean .pxi when cleaning (#14766)
(cherry picked from commit 43c24e621d)
2016-12-15 10:33:51 +01:00
Tara Adiseshan 08c8cf63ed added read_msgpack() to index (#14765)
(cherry picked from commit e3de052664)
2016-12-15 10:33:37 +01:00
gfyoung 8fda0c999a [Backport #14749] BUG: Improve error message for skipfooter malformed rows in Python engine (#14749)
Python's native CSV library does not respect the
skipfooter parameter, so if one of those skipped
rows is malformed, it will still raise an error.

Closes gh-13879.
(cherry picked from commit dfeae396c8)
2016-12-15 10:33:02 +01:00
Yaroslav Halchenko 5c72726458 [Backport #14756] BF: (re)raise the exception always unless returning (#14756)
otherwise leads atm to masking of this error while testing on i386
and then failing since

UnboundLocalError: local variable unser referenced before assignment

More detail: https://buildd.debian.org/status/fetch.php?pkg=pandas&arch=i386&ver=0.19.1-1&stamp=1479504883
(cherry picked from commit 2f43ac4c4c)
2016-12-15 10:32:17 +01:00
Jeff Reback 59f633f330 ENH: add data hashing routines (#14729)
xref https://github.com/dask/dask/pull/1807
(cherry picked from commit 06f26b51e9)
2016-12-15 10:31:29 +01:00
Kerby Shedden 6c688b947c [Backport #14743] BUG: SAS chunksize / iteration issues (#14743)
closes #14734
closes #13654
(cherry picked from commit c5f219acfc)
2016-12-15 10:30:41 +01:00
Joris Van den Bossche 68c7529d79 [Backport #14330] BUG: mixed freq timeseries plotting with shared axes (GH13341) (#14330)
(cherry picked from commit 6d2b34af75)
2016-12-15 10:30:12 +01:00
gfyoung 4a4bbace64 BUG: Improve error message for multi-char sep and quotes in Python engine (#14582)
If there is a field counts mismatch, check whether
a multi-char sep was used in conjunction with quotes.
Currently, that setup is not respected and can result
in improper line breaks.

Closes gh-13374.
(cherry picked from commit d8e427bda0)
2016-12-15 10:29:36 +01:00