Specify CSV file data types in Pandas

Python

Pandas

CSV

Specify data types

Papa D.

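When loading CSV files, Pandas infers each column's data type from its contents, and the inferred type is not always the one you want. A year or an identifier, for instance, may be read as an integer when it should stay a string. To avoid this, you can specify the types of individual columns explicitly.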
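As a rough illustration of how inference can go wrong, the sketch below reads a small in-memory CSV twice. The `zip` and `city` columns and their values are hypothetical and are not part of deniro.csv. They are only there to show leading zeros being dropped when a column is inferred as an integer.

```python
import io

import pandas as pd

# Hypothetical data: ZIP codes that start with zeros
csv_text = "zip,city\n02134,Boston\n00501,Holtsville\n"

# Default inference parses the zip column as integers, dropping the zeros
inferred = pd.read_csv(io.StringIO(csv_text))
print(inferred["zip"].tolist())

# An explicit dtype keeps the values exactly as written in the file
typed = pd.read_csv(io.StringIO(csv_text), dtype={"zip": str})
print(typed["zip"].tolist())
```

The first call yields whole numbers, while the second keeps the original text, which is usually what you want for identifiers.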

Code Example

Use the dtype argument to pd.read_csv() to specify column data types.

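import pandas as pd

# Map each column name to the type it should be parsed as
deniro_movies = pd.read_csv('deniro.csv', dtype={
    'Year': str,   # keep years as text instead of integers
    'Score': int,  # built-in int; the np.int alias no longer exists in current NumPy
    'Title': str
})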

deniro_movies.head()
   Year  Score         Title
0  1968     86     Greetings
1  1970     17   Bloody Mama
2  1970     73      Hi, Mom!
3  1971     40   Born to Win
4  1973     98  Mean Streets

deniro.csv is a CSV file that contains the following:

"Year","Score","Title"
1968,86,"Greetings"
1970,17,"Bloody Mama"
1970,73,"Hi, Mom!"
1971,40,"Born to Win"
1973,98,"Mean Streets"
1973,88,"Bang the Drum Slowly"
1974,97,"The Godfather, Part II"
1976,41,"The Last Tycoon"
1976,99,"Taxi Driver"
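To check that the requested types were applied, one option is to compare the column types with and without the dtype argument. The sketch below assumes the first few rows of the file are pasted into a string and read through io.StringIO, so it runs without deniro.csv on disk. The variable names are illustrative only.

```python
import io

import pandas as pd

# First rows of deniro.csv, inlined so the example is self-contained
csv_text = '''"Year","Score","Title"
1968,86,"Greetings"
1970,17,"Bloody Mama"
1970,73,"Hi, Mom!"
'''

# Let Pandas infer the types: Year and Score both come back as integers
inferred = pd.read_csv(io.StringIO(csv_text))
print(inferred.dtypes)

# Request explicit types: Year is now read as text
typed = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"Year": str, "Score": int, "Title": str},
)
print(typed.dtypes)
```

Comparing the two printouts shows which columns were affected by the explicit mapping.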