BUG: DataFrame.to_stata() uses wrong struct formats and crashes in int64 #6327

Closed
@bashtage

The relevant code is

self.DTYPE_MAP = \
    dict(
        lzip(range(1, 245), ['a' + str(i) for i in range(1, 245)]) +
        [
            (251, np.int16),
            (252, np.int32),
            (253, np.int64),
            (254, np.float32),
            (255, np.float64)
        ]
    )

and

self.TYPE_MAP = lrange(251) + list('bhlfd')

which pairs the struct format character h with int32 and l with int64 (and b with int16).
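As a rough illustration of that pairing, the sketch below rebuilds only the numeric entries of the two maps and compares NumPy's item size with the size `struct` reports for each format character. It assumes `list(range(...))` as a stand-in for `lrange`, and `compare_sizes` is a hypothetical helper name, not part of pandas.

```python
import struct

import numpy as np

# Numeric entries copied from the DTYPE_MAP quoted above.
dtype_map = {
    251: np.int16,
    252: np.int32,
    253: np.int64,
    254: np.float32,
    255: np.float64,
}

# Same construction as TYPE_MAP, with list(range(...)) standing in for lrange.
type_map = list(range(251)) + list('bhlfd')


def compare_sizes():
    """Return (type code, format char, dtype bytes, struct bytes) tuples."""
    rows = []
    for code in sorted(dtype_map):
        fmt = type_map[code]
        dtype_bytes = np.dtype(dtype_map[code]).itemsize
        # The '<' prefix selects little-endian, standard-size packing.
        struct_bytes = struct.calcsize('<' + fmt)
        rows.append((code, fmt, dtype_bytes, struct_bytes))
    return rows


for code, fmt, dtype_bytes, struct_bytes in compare_sizes():
    flag = '' if dtype_bytes == struct_bytes else '  <-- size mismatch'
    print(code, fmt, dtype_bytes, struct_bytes, flag)
```

Under these assumptions, the three integer rows are flagged as mismatched while the two float rows agree.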

http://docs.python.org/2/library/struct.html#format-characters

shows that, in standard size, h is 2 bytes and l is 4 bytes, so trying to run

struct.pack('<l',2**40)

produces an error.
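The failure can be reproduced without pandas. The following is a minimal sketch using only the standard `struct` module; `try_pack` is a hypothetical helper introduced here to capture the exception rather than let it propagate.

```python
import struct


def try_pack(fmt, value):
    """Pack value with fmt, returning the struct.error instead of raising it."""
    try:
        return struct.pack(fmt, value)
    except struct.error as exc:
        return exc


# 2**40 does not fit in the 4 bytes that '<l' provides.
too_big = try_pack('<l', 2**40)
print(type(too_big).__name__, too_big)

# The largest value that does fit is 2**31 - 1.
print(try_pack('<l', 2**31 - 1))
```

The first call returns a `struct.error`, while the second returns four packed bytes.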

The obvious fix is to use

self.TYPE_MAP = lrange(251) + list('blqfd')

but this will probably produce errors on 32-bit platforms.
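On the 32-bit concern, the Python `struct` documentation states that a byte-order prefix such as `<` selects standard sizes, under which `q` is 8 bytes regardless of platform; the platform dependence applies to native mode (no prefix). The sketch below only checks the `struct` behaviour, assuming the writer always packs with the `<` prefix as in the example above. It does not show whether the Stata file format accepts 8-byte integers for this type code, and note that `b` is still 1 byte in standard size.

```python
import struct

# Standard sizes (with the '<' prefix) do not depend on the platform.
standard_sizes = {fmt: struct.calcsize('<' + fmt) for fmt in 'bhlq'}
print(standard_sizes)

# With 'q', the value that failed under 'l' packs and round-trips.
packed = struct.pack('<q', 2**40)
roundtrip = struct.unpack('<q', packed)[0]
print(len(packed), roundtrip == 2**40)
```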

Labels: Bug, Dtype Conversions, IO Data
