pandas Operations, Part 2

astype

Using astype, we can convert a column from one data type to another, for example from strings to integers or floats.

				
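import pandas as pd

# Every column starts out holding strings
data = {'Name': ['GeeksForGeeks', 'Python'],
        'Unique ID': ['900', '450'],
        'distance': ['40.2', '30.5']}
df = pd.DataFrame(data)

# Convert the numeric-looking string columns to numeric dtypes
df['Unique ID'] = df['Unique ID'].astype(int)
df['distance'] = df['distance'].astype(float)
print(df.dtypes)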
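
As a complement to the column-by-column conversion above, here is a minimal sketch of converting several columns in one call by passing a column-to-dtype mapping to astype. The explicit "int64" and "float64" dtype strings and the name `converted` are choices made for this illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["GeeksForGeeks", "Python"],
    "Unique ID": ["900", "450"],
    "distance": ["40.2", "30.5"],
})

# Pass a {column: dtype} mapping to convert several columns at once.
# astype returns a new DataFrame; the original df is left unchanged.
converted = df.astype({"Unique ID": "int64", "distance": "float64"})

print(converted.dtypes)
```

Columns that are not listed in the mapping, such as Name here, keep their original dtype.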
				
			

where

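Using np.where, we can choose between two values for each row based on a condition. It does not remove any rows; here it is used to build a new column.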

				
import numpy as np
import pandas as pd
df = pd.DataFrame({'Type': list('ABBC'), 'Set': list('ZZXY')})
# 'green' where Set is 'Z', otherwise 'red'
df['color'] = np.where(df['Set'] == 'Z', 'green', 'red')
print(df)
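
Since np.where labels rows rather than removing them, it may help to see how actual filtering compares. The sketch below reuses the same small DataFrame and contrasts boolean indexing with the pandas Series.where method; the names `only_z` and `masked` and the "-" placeholder are made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({"Type": list("ABBC"), "Set": list("ZZXY")})

# Boolean indexing drops the rows where the condition is False.
only_z = df[df["Set"] == "Z"]

# Series.where keeps the original shape: values where the condition
# is False are replaced by `other` (NaN if `other` is not given).
masked = df["Type"].where(df["Set"] == "Z", other="-")

print(only_z)
print(masked.tolist())
```

Boolean indexing is usually the tool for filtering rows, while where (in either the NumPy or pandas form) is for conditional replacement.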
				
			

replace

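Using replace, we can substitute given values with new ones. When two lists are passed, each value in to_replace is swapped for the value at the same position in value.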

				
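import pandas as pd
dk = pd.DataFrame({"BrandName": ['A', 'B', 'ABC', 'D', 'AB'],
                   "Specialty": ['H', 'I', 'J', 'K', 'L']})
print(dk)
# 'ABC' becomes 'A' and 'AB' becomes 'B'; other values are unchanged
print(dk.BrandName.replace(to_replace=['ABC', 'AB'], value=['A', 'B']))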
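
The same replacement can also be written with a dictionary, which some readers find easier to scan. The following is a small sketch under that assumption; the variable names `mapped` and `whole_only` are illustrative. It also shows that, without regex=True, replace matches whole values rather than substrings.

```python
import pandas as pd

dk = pd.DataFrame({
    "BrandName": ["A", "B", "ABC", "D", "AB"],
    "Specialty": ["H", "I", "J", "K", "L"],
})

# A {old: new} dictionary is equivalent to the two parallel lists.
mapped = dk["BrandName"].replace({"ABC": "A", "AB": "B"})

# Matching is on whole values: 'ABC' and 'AB' are not touched here,
# even though they contain the letter 'A'.
whole_only = dk["BrandName"].replace({"A": "X"})

print(mapped.tolist())
print(whole_only.tolist())
```

Like most pandas operations, replace returns a new object and leaves `dk` unchanged unless the result is assigned back.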
				
			

drop_duplicates

Using drop_duplicates, we can remove duplicate rows. By default, the first occurrence of each row is kept.

				
import pandas as pd

emp = {"Name": ["Parker", "Smith", "William", "Parker"],
       "Age": [21, 32, 29, 21]}
info = pd.DataFrame(emp)
# The last row repeats the first one, so it is dropped
info = info.drop_duplicates()
print(info)
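
By default drop_duplicates compares entire rows. The sketch below illustrates the `subset` and `keep` parameters, which control which columns are compared and which occurrence survives. The data is a hypothetical variation of the example above: the second "Parker" row is given a different age so that the two rows are no longer identical.

```python
import pandas as pd

# Hypothetical data: the two 'Parker' rows differ in Age.
info = pd.DataFrame({
    "Name": ["Parker", "Smith", "William", "Parker"],
    "Age": [21, 32, 29, 40],
})

# No two rows are fully identical, so nothing is dropped.
full_rows = info.drop_duplicates()

# Compare only the Name column; the first 'Parker' is kept.
by_name = info.drop_duplicates(subset=["Name"])

# Same comparison, but keep the last occurrence instead.
by_name_last = info.drop_duplicates(subset=["Name"], keep="last")

print(by_name)
print(by_name_last)
```

The surviving rows keep their original index labels; calling reset_index(drop=True) afterwards is one way to renumber them.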
				
			

sort_index

sort_index sorts a DataFrame by its index labels rather than by the values in its columns.

				
import pandas as pd

# Now let us create our own dataset and perform the operations
students = [('Jack', 34, 'Sydney'),
            ('Riti', 31, 'Delhi'),
            ('Aadi', 16, 'New York'),
            ('Riti', 32, 'Delhi'),
            ('Riti', 33, 'Delhi'),
            ('Riti', 35, 'Mumbai'),
            ('Ajay', 21, 'Hyderabad')]

# Create a DataFrame object with a custom, unsorted index
df3 = pd.DataFrame(students, columns=['Name', 'Marks', 'City'],
                   index=[10, 15, 7, 8, 6, 1, 3])
print(df3)

# Observe the difference between the two sort_index calls below

# Sort by index labels in ascending order (the default)
print(df3.sort_index())
# axis=0 sorts by the row labels; ascending=False reverses the order
print(df3.sort_index(axis=0, ascending=False))
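
To make the effect of the `axis` and `ascending` arguments easier to check, here is a shortened sketch that uses only the first three students from the example above and prints the resulting label order. The variable names are illustrative.

```python
import pandas as pd

students = [("Jack", 34, "Sydney"), ("Riti", 31, "Delhi"), ("Aadi", 16, "New York")]
df3 = pd.DataFrame(students, columns=["Name", "Marks", "City"], index=[10, 15, 7])

# axis=0 (the default) sorts by the row labels.
by_row_labels = df3.sort_index()

# ascending=False reverses the order of the row labels.
descending = df3.sort_index(ascending=False)

# axis=1 sorts by the column labels instead of the row labels.
by_col_labels = df3.sort_index(axis=1)

print(by_row_labels.index.tolist())
print(descending.index.tolist())
print(by_col_labels.columns.tolist())
```

In every case the rows themselves stay intact; only their order (or the order of the columns) changes, and `df3` is not modified.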
				
			

value_counts

value_counts returns a Series containing the counts of unique values. The result is sorted in descending order, so the first element is the most frequently occurring value. NA values are excluded by default.

				
# Count how often each name appears in df3 (created in the sort_index example)
print(df3['Name'].value_counts())
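
The description above mentions that NA values are excluded by default. The sketch below illustrates that behaviour with the `dropna` parameter, along with `normalize` for relative frequencies. The Series is hypothetical data made up for this example, since df3 contains no missing names.

```python
import pandas as pd

# Hypothetical data with one missing name.
names = pd.Series(["Riti", "Riti", "Jack", None])

# Default: the missing value is not counted.
counts = names.value_counts()

# dropna=False adds an entry for the missing value.
with_na = names.value_counts(dropna=False)

# normalize=True returns fractions of the non-missing total instead of counts.
shares = names.value_counts(normalize=True)

print(counts)
print(with_na)
print(shares)
```

With the default settings, the counts add up to the number of non-missing values rather than the length of the Series.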
				
			
