Pandas is a popular data analysis library for Python that helps in drawing insights from data. Its central structure, the DataFrame, is a two-dimensional, labeled collection of data. In this article, you will learn about the astype() method and why it matters.
astype() Method:
The DataFrame.astype() method casts a Pandas object to a specified data type. It can convert a single column or every column of an existing DataFrame, which is useful whenever data has to be moved from one type to another.
Syntax:
DataFrame.astype(dtype, copy=True, errors='raise')
Here, dtype is any valid NumPy dtype or Python type, and it tells the method which type to cast to. Passing a single type casts the entire DataFrame, while passing a dictionary of {column name: type} casts only the listed columns. When copy is set to True, the method returns a copy of the data. Lastly, the errors parameter controls whether an exception is raised when the data cannot be cast to dtype: 'raise' (the default) lets the exception propagate, while 'ignore' suppresses it and returns the original object.
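As a rough illustration of the default errors='raise' behaviour described above, the sketch below tries to cast a column that contains a non-numeric string. The column name qty and the sample values are made up for this example.

```python
import pandas as pd

# Hypothetical data: the last value cannot be parsed as an integer.
bad = pd.DataFrame({"qty": ["1", "2", "three"]})

try:
    bad.astype("int64")  # errors='raise' is the default
    outcome = "converted"
except ValueError as exc:
    outcome = "failed"
    print("Conversion failed:", exc)

# The same cast succeeds when every value is a valid integer string.
good = pd.DataFrame({"qty": ["1", "2", "3"]}).astype("int64")
print(good.dtypes)
```

Catching the exception this way makes it easy to spot bad values before they reach later calculations.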
Example:
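import pandas as pd
# Build a small DataFrame with two integer columns
dat = {'c1': [12, 14], 'c2': [16, 18]}
df = pd.DataFrame(data=dat)
# Show the data type of each column
print(df.dtypes)
Output:
c1    int64
c2    int64
dtype: object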
Now, by applying the astype() method, you can convert every column of the DataFrame to another valid data type, here int32.
print(df.astype('int32').dtypes)
Output:
c1    int32
c2    int32
dtype: object
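To cast only some columns, astype() also accepts a dictionary that maps column names to types. The following is a minimal sketch that reuses the c1 and c2 columns from the example; the choice of float64 as the target type is just for illustration.

```python
import pandas as pd

df = pd.DataFrame({"c1": [12, 14], "c2": [16, 18]})

# Cast only c1; columns not named in the dictionary keep their type.
converted = df.astype({"c1": "float64"})

print(converted.dtypes)
print(df.dtypes)  # the original DataFrame is left unchanged
```

Because astype() returns a new object, the result has to be assigned to a variable if you want to keep it.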
astype() with the DataFrame objects:
Category is another data type that helps data analysts manage DataFrame values that repeat across a small set of labels. By default, string columns in a DataFrame created from a dictionary are stored with the object data type. With astype(), you can convert them to the category type.
Program:
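import pandas as pd
dat = {"Gender":['M','M','M','F','M','F','M'], "NAME":['Karlos','Gaurav','Ray','Dee','Steve','Su','Ganesh']}
b = pd.DataFrame(dat)
print("Given data and their types:\n")
print(b)
# Both string columns start out with the object data type
print(b.dtypes)
Output:
Given data and their types:

  Gender    NAME
0      M  Karlos
1      M  Gaurav
2      M     Ray
3      F     Dee
4      M   Steve
5      F      Su
6      M  Ganesh
Gender    object
NAME      object
dtype: object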
Now, to change its type from object to category, you have to use the astype() method.
Program:
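import pandas as pd
dat = {"Gender":['M','M','M','F','M','F','M'], "NAME":['Karlos','Gaurav','Ray','Dee','Steve','Su','Ganesh']}
b = pd.DataFrame(dat)
print("Given data and their types:\n")
print(b)
print(b.dtypes)
# Cast only the Gender column to the category type
b['Gender'] = b['Gender'].astype('category')
print(b.dtypes)
Output:
Given data and their types:

  Gender    NAME
0      M  Karlos
1      M  Gaurav
2      M     Ray
3      F     Dee
4      M   Steve
5      F      Su
6      M  Ganesh
Gender    object
NAME      object
dtype: object
Gender    category
NAME        object
dtype: object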
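To see what the category type actually stores, the sketch below inspects the converted Gender column through the .cat accessor. It rebuilds the same data so that it can run on its own; the variable name gender is introduced only for this example.

```python
import pandas as pd

dat = {"Gender": ['M', 'M', 'M', 'F', 'M', 'F', 'M'],
       "NAME": ['Karlos', 'Gaurav', 'Ray', 'Dee', 'Steve', 'Su', 'Ganesh']}
b = pd.DataFrame(dat)

gender = b["Gender"].astype("category")

# The distinct labels are stored once...
print(list(gender.cat.categories))
# ...and each row holds only a small integer code pointing at a label.
print(gender.cat.codes.tolist())
```

Each label is kept once and the rows refer to it by code, which is why a category column suits data with few distinct values.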
Conclusion:
Since DataFrames hold different types of data for analysis and calculations, it is essential to understand what type of data they hold at different points in a program. Also, when similar data arrives in several related types (int16, int32, int64, etc.), astype() is a helpful tool for bringing it all under a single type.
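As a small sketch of that last point, the example below builds a DataFrame whose columns use three different integer widths and brings them all to int64. The column names a, b and c and the values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical columns that arrive with different integer widths.
mixed = pd.DataFrame({
    "a": np.array([1, 2], dtype="int16"),
    "b": np.array([3, 4], dtype="int32"),
    "c": np.array([5, 6], dtype="int64"),
})
print(mixed.dtypes)

# A single astype() call brings every column to the same type.
unified = mixed.astype("int64")
print(unified.dtypes)
```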