Improve the performance of instantiating a Series object with dictionary data and a datetimeindex #14894

Closed
@nateyoder

Description

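When a Series is constructed from a dict together with a DatetimeIndex, the current code path always raises an exception at:

data = lib.fast_multiget(data, index.astype('O'),

which is then caught. Avoiding that exception gives only a slight performance advantage (about 13% less time in the benchmark below), but hopefully the code change also makes this path less confusing for newcomers like me.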
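
To illustrate the pattern being described, here is a minimal sketch in plain Python. All function names are hypothetical and do not mirror the pandas internals; the point is only that a fast path which always raises and is then caught yields the same result as calling the fallback directly, while adding overhead and hiding the real control flow.

```python
def fast_path(mapping, keys):
    # Hypothetical fast path that cannot handle this input and always raises.
    raise TypeError("unsupported key type")


def slow_path(mapping, keys):
    # Plain per-key lookup; missing keys become None.
    return [mapping.get(k) for k in keys]


def lookup_with_fallback(mapping, keys):
    # Pattern described in the issue: try the fast path, catch, fall back.
    try:
        return fast_path(mapping, keys)
    except TypeError:
        return slow_path(mapping, keys)


def lookup_direct(mapping, keys):
    # Going straight to the working path skips the raise/catch round trip.
    return slow_path(mapping, keys)


mapping = {"a": 1, "b": 2}
keys = ["a", "b", "c"]
result_fallback = lookup_with_fallback(mapping, keys)
result_direct = lookup_direct(mapping, keys)
```

Both calls return the same list, so removing the doomed attempt changes readability and speed but not the result.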

Code Sample, a copy-pastable example if possible

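from datetime import datetime

import pandas as pd
from pandas import Series

# Build a 10-second DatetimeIndex and a dict keyed by the same timestamps.
dr = pd.date_range(
    start=datetime(2015, 10, 26),
    end=datetime(2016, 1, 1),
    freq='10s'
)
data = {d: v for d, v in zip(dr, range(len(dr)))}
s = Series(data=data, index=dr)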
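
For a quick local check without ASV, something like the following could be used to time the constructor. This is only a rough sketch: the one-day date range is my own choice to keep the run short (it is not the range used above), and a single `time.perf_counter` measurement is much noisier than a proper benchmark.

```python
import time
from datetime import datetime

import pandas as pd

# Assumption: a one-day range at 10-second frequency, smaller than the
# range in the code sample, so the check finishes quickly.
dr = pd.date_range(
    start=datetime(2015, 10, 26),
    end=datetime(2015, 10, 27),
    freq='10s'
)
data = {d: v for d, v in zip(dr, range(len(dr)))}

start = time.perf_counter()
s = pd.Series(data=data, index=dr)
elapsed = time.perf_counter() - start

print("constructed {} rows in {:.4f}s".format(len(s), elapsed))
```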

ASV output of new benchmark
Running 2 total benchmarks (2 commits * 1 environments * 1 benchmarks)
[ 0.00%] · For pandas commit hash 5f05fdc:
[ 0.00%] ·· Building for conda-py2.7-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt.................................
[ 0.00%] ·· Benchmarking conda-py2.7-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
[ 50.00%] ··· Running ...x.time_series_constructor_no_data_datetime_index 3.26s
[ 50.00%] · For pandas commit hash 3ba2cff:
[ 50.00%] ·· Building for conda-py2.7-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt...
[ 50.00%] ·· Benchmarking conda-py2.7-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
[100.00%] ··· Running ...x.time_series_constructor_no_data_datetime_index 3.77s

    before      after      ratio
  [3ba2cff]   [5f05fdc]
-   3.77s       3.26s      0.87  series_methods.series_constructor_dict_data_datetime_index.time_series_constructor_no_data_datetime_index
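
For readers unfamiliar with ASV, a benchmark of this kind is typically a class with a `setup` method and one or more `time_*` methods. The sketch below reuses the class and method names that appear in the ASV output above, but the bodies are an assumption modeled on the code sample in this issue, not a copy of the benchmark that was actually added.

```python
from datetime import datetime

import pandas as pd


class series_constructor_dict_data_datetime_index:
    # Names come from the ASV output; the bodies are assumed, not the
    # actual benchmark source.

    def setup(self):
        # ASV runs setup before timing, so building the inputs is not timed.
        self.dr = pd.date_range(
            start=datetime(2015, 10, 26),
            end=datetime(2016, 1, 1),
            freq='10s'
        )
        self.data = {d: v for d, v in zip(self.dr, range(len(self.dr)))}

    def time_series_constructor_no_data_datetime_index(self):
        # Only the Series construction itself is measured.
        pd.Series(data=self.data, index=self.dr)
```

ASV discovers methods by the `time_` prefix, which is why the method name shows up in the results table.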
    

Labels: Datetime (Datetime data dtype), Performance (Memory or execution speed performance)