BUG: Wrong Custom Formatters applied when displaying truncated frames #35410

@ipcoder

Problem description

I am providing custom formatters for specific columns as a dict.
If the frame is large enough that some columns are truncated, the wrong formatters are applied to the displayed columns.
(In my case this leads to crashes, because a formatter receives data of the wrong type.)

Please note that the behavior changes with the width of the console window, because different columns are displayed.
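A minimal sketch of how this might be reproduced. Everything in it is an illustrative assumption rather than my real data: the column names c0..c9, the helper names build_frame and render, the str.upper formatter, and the use of max_cols=4 to force truncation regardless of console width. On an affected version I would expect the truncated call to fail, because the formatter registered for the hidden string column c2 would be applied to an integer column; on a version without the bug it should simply print the truncated frame.

```python
import pandas as pd


def build_frame(n_cols: int = 10) -> pd.DataFrame:
    """Integer columns c0..c9, except c2, which holds strings."""
    data = {f"c{i}": [i, i + 1] for i in range(n_cols)}
    data["c2"] = ["a", "b"]  # overwrites in place, so c2 stays at position 2
    return pd.DataFrame(data)


def render(df: pd.DataFrame, max_cols=None) -> str:
    # The formatter is only valid for the string column c2.
    return df.to_string(formatters={"c2": str.upper}, max_cols=max_cols)


if __name__ == "__main__":
    df = build_frame()
    print(render(df))  # no truncation: c2 is shown upper-cased
    try:
        # Truncation hides c2; only the first and last columns are shown.
        print(render(df, max_cols=4))
    except Exception as exc:  # expected only on affected versions
        print(f"truncated rendering failed: {exc!r}")
```

Passing max_cols explicitly is only a way to make the truncation deterministic; with the default display settings the same effect depends on the console width.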

Problem investigation

I have examined the code of my pandas version (1.0.5) and compared it with the latest version on GitHub; the bug still seems to be there.

The problem starts in this method (DataFrameFormatter._to_str_columns): frame is set to the truncated frame (frame = self.tr_frame),
and then self._format_col(i) is called with the index of the column in the TRUNCATED frame:

    def _to_str_columns(self) -> List[List[str]]:
        """
        Render a DataFrame to a list of columns (as lists of strings).
        """
        # this method is not used by to_html where self.col_space
        # could be a string so safe to cast
        self.col_space = cast(int, self.col_space)

        frame = self.tr_frame
        # may include levels names also

        str_index = self._get_formatted_index(frame)

        if not is_list_like(self.header) and not self.header:
            stringified = []
            for i, c in enumerate(frame):
                fmt_values = self._format_col(i)

This "truncated" column index is then passed to self._get_formatter:

    def _format_col(self, i: int) -> List[str]:
        frame = self.tr_frame
        formatter = self._get_formatter(i)   # the problem is HERE? _get_formatter(frame.columns[i]) ?

which looks up the formatter in the full frame's columns, using an index i that refers to a column of the truncated frame:

           # ...
        else:
            if is_integer(i) and i not in self.columns:
                i = self.columns[i]
            return self.formatters.get(i, None)
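The mismatch described above can be sketched in plain Python, without pandas. This is a simplified model, not the library code: truncate_columns, lookup_by_full_position and lookup_by_truncated_label are hypothetical helpers, and the rule "keep the first and last max_cols // 2 labels" is an assumption about how the truncated frame is built. The second lookup corresponds to the change suggested in the comment above, resolving the position against the truncated frame's columns first.

```python
from typing import Callable, Dict, List, Optional

Formatter = Callable[[object], str]


def truncate_columns(columns: List[str], max_cols: int) -> List[str]:
    """Simplified model: keep the first and last max_cols // 2 labels."""
    if max_cols < 2:
        raise ValueError("this sketch assumes max_cols >= 2")
    if len(columns) <= max_cols:
        return list(columns)
    half = max_cols // 2
    return columns[:half] + columns[-half:]


def lookup_by_full_position(
    i: int, full_columns: List[str], formatters: Dict[str, Formatter]
) -> Optional[Formatter]:
    # Mirrors the quoted branch: position i is resolved against the FULL
    # column list, although i was produced by enumerating the truncated one.
    return formatters.get(full_columns[i])


def lookup_by_truncated_label(
    i: int, truncated_columns: List[str], formatters: Dict[str, Formatter]
) -> Optional[Formatter]:
    # Suggested direction: turn the position into a label using the
    # truncated columns, then look the label up.
    return formatters.get(truncated_columns[i])


def fmt_text(value: object) -> str:
    return str(value).upper()


def fmt_number(value: object) -> str:
    return f"{value:>5}"


full = [f"c{i}" for i in range(10)]
shown = truncate_columns(full, max_cols=4)  # c0, c1, c8, c9
formatters = {"c2": fmt_text, "c8": fmt_number}

# Position 2 of the displayed columns is c8, but the first lookup
# returns the formatter that was registered for c2.
wrong = lookup_by_full_position(2, full, formatters)
right = lookup_by_truncated_label(2, shown, formatters)
```

In this model, wrong is fmt_text and right is fmt_number, which matches the symptom: a formatter written for one column receives values from another.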

INSTALLED VERSIONS

commit : None
python : 3.6.10.final.0
python-bits : 64
OS : Linux
OS-release : 5.4.0-37-generic
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.0.5
numpy : 1.18.5
pytz : 2019.3
dateutil : 2.8.1
pip : 20.0.2
setuptools : 46.0.0.post20200309
Cython : 0.29.15
pytest : 5.4.1
hypothesis : 5.19.3
sphinx : 2.4.0
blosc : None
feather : None
xlsxwriter : 1.2.8
lxml.etree : 4.5.0
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : 2.11.1
IPython : 7.16.1
pandas_datareader: None
bs4 : 4.8.2
bottleneck : 1.3.2
fastparquet : None
gcsfs : None
lxml.etree : 4.5.0
matplotlib : 3.2.2
numexpr : 2.7.1
odfpy : None
openpyxl : 3.0.3
pandas_gbq : None
pyarrow : None
pytables : None
pytest : 5.4.1
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : 1.3.15
tables : 3.4.4
tabulate : 0.8.3
xarray : 0.15.0
xlrd : 1.2.0
xlwt : 1.3.0
xlsxwriter : 1.2.8
numba : 0.50.1
