Wednesday, September 18, 2024

Pandas Dataframe Index in Python


Pandas dataframes are one of the most used data structures for data analysis and machine learning tasks in Python. In this article, we will discuss how to create and delete an index from a pandas dataframe. We will also discuss multilevel indexing in a pandas dataframe and how we can access elements from a dataframe using dataframe indices.

What Is a Pandas Dataframe Index?

Just as a dataframe has column names, you can consider an index as a row label. When we create a dataframe, the rows of the dataframe are assigned indices starting from 0 up to the number of rows minus one, as shown below.

import pandas as pd
list1=[1,2,3]
list2=[3,55,34]
list3=[12,32,45]
myList=[list1,list2,list3]
myDf=pd.DataFrame(myList,columns=["A", "B", "C"])
print("The dataframe is:")
print(myDf)
print("The index is:")
index=list(myDf.index)
print(index)

Output:

The dataframe is:
    A   B   C
0   1   2   3
1   3  55  34
2  12  32  45
The index is:
[0, 1, 2]
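
As a quick check of the default behaviour described above, the following sketch inspects the index object itself. It assumes a recent pandas version, in which the default index is a RangeIndex. The variable names df, default_index and first_row are only illustrative.

```python
import pandas as pd

# Build the same three-row dataframe without passing an index.
df = pd.DataFrame([[1, 2, 3], [3, 55, 34], [12, 32, 45]], columns=["A", "B", "C"])

# The default index runs from 0 to len(df) - 1.
default_index = df.index
print(type(default_index).__name__)
print(list(default_index))

# A row label can be used with .loc to select the matching row.
first_row = df.loc[0]
print(first_row["B"])
```

Here, the label 0 passed to loc refers to the row label in the index, which happens to coincide with the row position only because the default index is used.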

Create an Index While Creating a Pandas Dataframe

You can also create custom indices while creating a dataframe. For this, you can use the index parameter of the DataFrame() function. The index parameter takes a list of values and assigns the values as indices of the rows in the dataframe. You can observe this in the following example.

import pandas as pd
list1=[1,2,3]
list2=[3,55,34]
list3=[12,32,45]
myList=[list1,list2,list3]
myDf=pd.DataFrame(myList,columns=["A", "B", "C"],index=[101,102,103])
print("The dataframe is:")
print(myDf)
print("The index is:")
index=list(myDf.index)
print(index)

Output:

The dataframe is:
      A   B   C
101   1   2   3
102   3  55  34
103  12  32  45
The index is:
[101, 102, 103]

In the above example, we have created the index of the dataframe using the list [101, 102, 103] and the index parameter of the DataFrame() function.

Here, you need to make sure that the number of elements in the list passed to the index parameter is equal to the number of rows in the dataframe. Otherwise, the program will run into a ValueError exception as shown below.

import pandas as pd
list1=[1,2,3]
list2=[3,55,34]
list3=[12,32,45]
myList=[list1,list2,list3]
myDf=pd.DataFrame(myList,columns=["A", "B", "C"],index=[101,102,103,104])
print("The dataframe is:")
print(myDf)
print("The index is:")
index=list(myDf.index)
print(index)

Output:

ValueError: Length of values (3) does not match length of index (4)

In the above example, you can observe that we have passed four elements in the list passed to the index parameter. However, the dataframe has only three rows. Hence, the program runs into a Python ValueError exception.
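
If the index labels come from user input or another file, you may want to handle this error instead of letting the program stop. The following is a minimal sketch of that idea. The names rows, labels and error_message are illustrative, and the exact wording of the error message may differ between pandas versions.

```python
import pandas as pd

rows = [[1, 2, 3], [3, 55, 34], [12, 32, 45]]
labels = [101, 102, 103, 104]  # four labels for three rows

try:
    df = pd.DataFrame(rows, columns=["A", "B", "C"], index=labels)
    error_message = None
except ValueError as err:
    # pandas refuses to build the dataframe when the lengths differ.
    error_message = str(err)
    print("Could not create the dataframe:", error_message)
```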

Create Dataframe Index While Loading a CSV File

If you are creating a dataframe from a csv file and you want to make a column of the csv file the dataframe index, you can use the index_col parameter in the read_csv() function.

The index_col parameter takes the name of the column as its input argument. After execution of the read_csv() function, the specified column is assigned as the index of the dataframe. You can observe this in the following example.

import pandas as pd
myDf=pd.read_csv("samplefile.csv",index_col="Class")
print("The dataframe is:")
print(myDf)
print("The index is:")
index=list(myDf.index)
print(index)

Output:

The dataframe is:
       Roll      Name
Class                
1        11    Aditya
1        12     Chris
1        13       Sam
2         1      Joel
2        22       Tom
2        44  Samantha
3        33      Tina
3        34       Amy
The index is:
[1, 1, 1, 2, 2, 2, 3, 3]

You can also pass the position of a column in the column list instead of its name as an input argument to the index_col parameter. For instance, if you want to make the first column of the csv file the index of the dataframe, you can pass 0 to the index_col parameter in the read_csv() function as shown below.

import pandas as pd
myDf=pd.read_csv("samplefile.csv",index_col=0)
print("The dataframe is:")
print(myDf)
print("The index is:")
index=list(myDf.index)
print(index)

Output:

The dataframe is:
       Roll      Name
Class                
1        11    Aditya
1        12     Chris
1        13       Sam
2         1      Joel
2        22       Tom
2        44  Samantha
3        33      Tina
3        34       Amy
The index is:
[1, 1, 1, 2, 2, 2, 3, 3]

Here, the Class column is the first column in the csv file. Hence, it is converted into the index of the dataframe.

The index_col parameter also takes multiple values as its input. We have discussed this in the section on multilevel indexing in dataframes.
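
If you do not have samplefile.csv at hand, you can try the index_col parameter on in-memory text. The sketch below is only an illustration: CSV_TEXT is a hypothetical stand-in for the file, reconstructed from the output shown above, and io.StringIO is used so that read_csv() can read it like a file.

```python
import io
import pandas as pd

# Hypothetical stand-in for samplefile.csv, reconstructed from the output above.
CSV_TEXT = """Class,Roll,Name
1,11,Aditya
1,12,Chris
1,13,Sam
2,1,Joel
2,22,Tom
2,44,Samantha
3,33,Tina
3,34,Amy
"""

# Select the index column by name and by position.
by_name = pd.read_csv(io.StringIO(CSV_TEXT), index_col="Class")
by_position = pd.read_csv(io.StringIO(CSV_TEXT), index_col=0)

print(by_name.index.name)
print(by_name.equals(by_position))
```

Both calls should produce the same dataframe here, because Class is the first column in the text.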

Create an Index After Creating a Pandas Dataframe

When a dataframe is created, the rows of the dataframe are assigned indices starting from 0 up to the number of rows minus one. However, we can create a custom index for a dataframe using the index attribute.

To create a custom index in a pandas dataframe, we will assign a list of index labels to the index attribute of the dataframe. After execution of the assignment statement, a new index is created for the dataframe as shown below.

import pandas as pd
myDf=pd.read_csv("samplefile.csv")
print("The dataframe is:")
myDf.index=[101,102,103,104,105,106,107,108]
print(myDf)
print("The index is:")
index=list(myDf.index)
print(index)

Output:

The dataframe is:
     Class  Roll      Name
101      1    11    Aditya
102      1    12     Chris
103      1    13       Sam
104      2     1      Joel
105      2    22       Tom
106      2    44  Samantha
107      3    33      Tina
108      3    34       Amy
The index is:
[101, 102, 103, 104, 105, 106, 107, 108]

Here, you can see that we have assigned a list containing numbers from 101 to 108 to the index attribute of the dataframe. Hence, the elements of the list are converted into indices of the rows in the dataframe.

Remember that the total number of index labels in the list must be equal to the number of rows in the dataframe. Otherwise, the program will run into a ValueError exception.
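
The following sketch illustrates this length check on a small, made-up dataframe. The variable names df and assigned are illustrative and are not part of the examples above.

```python
import pandas as pd

df = pd.DataFrame({"Roll": [11, 12, 13], "Name": ["Aditya", "Chris", "Sam"]})

try:
    df.index = [101, 102]  # two labels for three rows
    assigned = True
except ValueError as err:
    assigned = False
    print("Index assignment failed:", err)

# The failed assignment leaves the existing index in place.
print(list(df.index))
```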

Convert Column of a DataFrame into Index

We can also use a column as the index of the dataframe. For this, we can use the set_index() method. The set_index() method, when invoked on a dataframe, takes the column name as its input argument. After execution, it returns a new dataframe with the specified column as its index as shown in the following example.

import pandas as pd
myDf=pd.read_csv("samplefile.csv")
print("The dataframe is:")
myDf=myDf.set_index("Class")
print(myDf)
print("The index is:")
index=list(myDf.index)
print(index)

Output:

The dataframe is:
       Roll      Name
Class                
1        11    Aditya
1        12     Chris
1        13       Sam
2         1      Joel
2        22       Tom
2        44  Samantha
3        33      Tina
3        34       Amy
The index is:
[1, 1, 1, 2, 2, 2, 3, 3]

In the above example, we have used the set_index() method to create an index from an existing column of the dataframe instead of a new sequence.
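
Because set_index() returns a new dataframe, the original dataframe stays unchanged unless you reassign the result. The sketch below shows this on a small, made-up dataframe. It also uses the drop parameter of set_index() to keep the column in the dataframe as well as in the index. The names df, indexed and kept are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {"Class": [1, 1, 2], "Roll": [11, 12, 1], "Name": ["Aditya", "Chris", "Joel"]}
)

indexed = df.set_index("Class")          # Class moves into the index
kept = df.set_index("Class", drop=False)  # Class is the index and stays a column

print(list(df.columns))       # the original dataframe is unchanged
print(list(indexed.columns))
print(list(kept.columns))
```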

Change Index of a Pandas Dataframe

You can change the index column of a dataframe using the set_index() method. For this, you just need to pass the column name of the new index column as input to the set_index() method as shown below.

import pandas as pd
myDf=pd.read_csv("samplefile.csv")
print("The dataframe is:")
myDf=myDf.set_index("Class")
print(myDf)
print("The index is:")
index=list(myDf.index)
print(index)
print("The modified dataframe is:")
newDf=myDf.set_index("Roll")
print(newDf)
print("The index is:")
index=list(newDf.index)
print(index)

Output:

The dataframe is:
       Roll      Name
Class                
1        11    Aditya
1        12     Chris
1        13       Sam
2         1      Joel
2        22       Tom
2        44  Samantha
3        33      Tina
3        34       Amy
The index is:
[1, 1, 1, 2, 2, 2, 3, 3]
The modified dataframe is:
          Name
Roll          
11      Aditya
12       Chris
13         Sam
1         Joel
22         Tom
44    Samantha
33        Tina
34         Amy
The index is:
[11, 12, 13, 1, 22, 44, 33, 34]

If you want to assign a sequence as the new index of the dataframe, you can assign the sequence to the index attribute of the pandas dataframe as shown below.

import pandas as pd
myDf=pd.read_csv("samplefile.csv")
print("The dataframe is:")
myDf=myDf.set_index("Class")
print(myDf)
print("The index is:")
index=list(myDf.index)
print(index)
print("The modified dataframe is:")
myDf.index=[101, 102, 103, 104, 105, 106, 107, 108]
print(myDf)
print("The index is:")
index=list(myDf.index)
print(index)

Output:

The dataframe is:
       Roll      Name
Class                
1        11    Aditya
1        12     Chris
1        13       Sam
2         1      Joel
2        22       Tom
2        44  Samantha
3        33      Tina
3        34       Amy
The index is:
[1, 1, 1, 2, 2, 2, 3, 3]
The modified dataframe is:
     Roll      Name
101    11    Aditya
102    12     Chris
103    13       Sam
104     1      Joel
105    22       Tom
106    44  Samantha
107    33      Tina
108    34       Amy
The index is:
[101, 102, 103, 104, 105, 106, 107, 108]

When we change the index column of a dataframe, the existing index column is deleted from the dataframe. Therefore, you should first store the index column in a new column of the dataframe before changing the index column. Otherwise, you will lose the data stored in the index column of your dataframe.

import pandas as pd
myDf=pd.read_csv("samplefile.csv")
print("The dataframe is:")
myDf=myDf.set_index("Class")
print(myDf)
print("The index is:")
index=list(myDf.index)
print(index)
print("The modified dataframe is:")
myDf["Class"]=myDf.index
myDf.index=[101, 102, 103, 104, 105, 106, 107, 108]
print(myDf)
print("The index is:")
index=list(myDf.index)
print(index)

Output:

The dataframe is:
       Roll      Name
Class                
1        11    Aditya
1        12     Chris
1        13       Sam
2         1      Joel
2        22       Tom
2        44  Samantha
3        33      Tina
3        34       Amy
The index is:
[1, 1, 1, 2, 2, 2, 3, 3]
The modified dataframe is:
     Roll      Name  Class
101    11    Aditya      1
102    12     Chris      1
103    13       Sam      1
104     1      Joel      2
105    22       Tom      2
106    44  Samantha      2
107    33      Tina      3
108    34       Amy      3
The index is:
[101, 102, 103, 104, 105, 106, 107, 108]

Here, you can observe that we first stored the index in the Class column before changing the index of the dataframe. In the previous example, we hadn't done that. Because of this, the data in the Class column was lost.
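
Another way to keep the old index is to turn it back into a column with the reset_index() method, which is discussed later in this article, before setting the new index. The following is a minimal sketch of that approach on a small, made-up dataframe. The names df and switched are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {"Class": [1, 1, 2], "Roll": [11, 12, 1], "Name": ["Aditya", "Chris", "Joel"]}
).set_index("Class")

# reset_index() turns the current index (Class) back into a column,
# so nothing is lost when Roll becomes the new index.
switched = df.reset_index().set_index("Roll")
print(switched)
```

With this approach, the Class values end up as a normal column without a manual copy step.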

Create Multilevel Index in a Pandas Dataframe

You can also create a multilevel index in a dataframe. Multilevel indices help you access hierarchical data, such as census data, that has different levels of abstraction. We can create multilevel indices while creating the dataframe as well as after creating the dataframe. This is discussed as follows.

Create a Multilevel Index While Creating a Dataframe

To create a multilevel index using different columns of a dataframe, you can use the index_col parameter in the read_csv() function. The index_col parameter takes a list of columns that have to be used as indices. The order of the column names in the list given to the index_col parameter from left to right is from highest to lowest level of index. After execution of the read_csv() function, you will get a dataframe with a multilevel index as shown in the following example.

import pandas as pd
myDf=pd.read_csv("samplefile.csv",index_col=["Class","Roll"])
print("The dataframe is:")
print(myDf)
print("The index is:")
index=list(myDf.index)
print(index)

Output:

The dataframe is:
                Name
Class Roll          
1     11      Aditya
      12       Chris
      13         Sam
2     1         Joel
      22         Tom
      44    Samantha
3     33        Tina
      34         Amy
The index is:
[(1, 11), (1, 12), (1, 13), (2, 1), (2, 22), (2, 44), (3, 33), (3, 34)]

In the above example, the Class column contains the first level of the index and the Roll column contains the second level of the index. To select a specific row from the dataframe, you need to know its index label at both levels.
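
To illustrate how elements can be accessed with a multilevel index, the sketch below uses the loc attribute. A full tuple of labels selects one row, while a label for the first level alone selects all the rows under it. CSV_TEXT is a hypothetical stand-in for samplefile.csv, reconstructed from the output above, and the other variable names are illustrative.

```python
import io
import pandas as pd

# Hypothetical stand-in for samplefile.csv, reconstructed from the output above.
CSV_TEXT = """Class,Roll,Name
1,11,Aditya
1,12,Chris
1,13,Sam
2,1,Joel
2,22,Tom
2,44,Samantha
3,33,Tina
3,34,Amy
"""

df = pd.read_csv(io.StringIO(CSV_TEXT), index_col=["Class", "Roll"])

# A (Class, Roll) tuple identifies exactly one row.
name = df.loc[(1, 12), "Name"]
print(name)

# A first-level label alone returns every row of that class, indexed by Roll.
class_two = df.loc[2]
print(class_two)
```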

Instead of using the column names, you can also pass the positions of the columns in the column list as the input argument to the index_col parameter. For instance, you can assign the first and second columns of the csv file as the index of the dataframe as shown below.

import pandas as pd
myDf=pd.read_csv("samplefile.csv",index_col=[0,1])
print("The dataframe is:")
print(myDf)
print("The index is:")
index=list(myDf.index)
print(index)

Output:

The dataframe is:
                Name
Class Roll          
1     11      Aditya
      12       Chris
      13         Sam
2     1         Joel
      22         Tom
      44    Samantha
3     33        Tina
      34         Amy
The index is:
[(1, 11), (1, 12), (1, 13), (2, 1), (2, 22), (2, 44), (3, 33), (3, 34)]

Create a Multilevel Index After Creating a Dataframe

You can also create a multilevel index after creating a dataframe using the set_index() method. For this, you just need to pass a list of column names to the set_index() method. Again, the order of the column names in the list given to the set_index() method from left to right is from highest to lowest level of index as shown below.

import pandas as pd
myDf=pd.read_csv("samplefile.csv")
print("The dataframe is:")
myDf=myDf.set_index(["Class","Roll"])
print(myDf)
print("The index is:")
index=list(myDf.index)
print(index)

Output:

The dataframe is:
                Name
Class Roll          
1     11      Aditya
      12       Chris
      13         Sam
2     1         Joel
      22         Tom
      44    Samantha
3     33        Tina
      34         Amy
The index is:
[(1, 11), (1, 12), (1, 13), (2, 1), (2, 22), (2, 44), (3, 33), (3, 34)]

You need to keep in mind that the set_index() method removes the existing index column. If you want to save the data stored in the index column, you should copy the data into another column before creating a new index.
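
After creating a multilevel index, you may want to check its levels. The sketch below does this on a small, made-up dataframe using the names and nlevels attributes and the get_level_values() method of the index. The variable names df and multi are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {"Class": [1, 1, 2], "Roll": [11, 12, 1], "Name": ["Aditya", "Chris", "Joel"]}
)

multi = df.set_index(["Class", "Roll"])

print(list(multi.index.names))  # level names, from highest to lowest level
print(multi.index.nlevels)      # number of levels in the index
print(list(multi.index.get_level_values("Roll")))  # labels of a single level
```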

Remove Index From a Pandas Dataframe

To remove the index from a pandas dataframe, you can use the reset_index() method. The reset_index() method, when invoked on a dataframe, returns a new dataframe that has the default integer index instead of the existing index. If the existing index was created from a column, it is converted back into a normal column as shown below.

import pandas as pd
myDf=pd.read_csv("samplefile.csv",index_col=[0,1])
print("The dataframe is:")
print(myDf)
print("The index is:")
index=list(myDf.index)
print(index)
myDf=myDf.reset_index()
print("The modified dataframe is:")
print(myDf)
print("The index is:")
index=list(myDf.index)
print(index)

Output:

The dataframe is:
                Name
Class Roll          
1     11      Aditya
      12       Chris
      13         Sam
2     1         Joel
      22         Tom
      44    Samantha
3     33        Tina
      34         Amy
The index is:
[(1, 11), (1, 12), (1, 13), (2, 1), (2, 22), (2, 44), (3, 33), (3, 34)]
The modified dataframe is:
   Class  Roll      Name
0      1    11    Aditya
1      1    12     Chris
2      1    13       Sam
3      2     1      Joel
4      2    22       Tom
5      2    44  Samantha
6      3    33      Tina
7      3    34       Amy
The index is:
[0, 1, 2, 3, 4, 5, 6, 7]
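
The reset_index() method also accepts the drop and level parameters. The following sketch, built on a small, made-up dataframe, shows how they change the result: drop=True discards the index instead of converting it into columns, and level resets only the named level. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {"Class": [1, 1, 2], "Roll": [11, 12, 1], "Name": ["Aditya", "Chris", "Joel"]}
)
multi = df.set_index(["Class", "Roll"])

restored = multi.reset_index()             # both levels become columns again
dropped = multi.reset_index(drop=True)     # both levels are discarded
partial = multi.reset_index(level="Roll")  # only Roll becomes a column

print(list(restored.columns))
print(list(dropped.columns))
print(partial.index.name, list(partial.columns))
```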

Conclusion

In this article, we have discussed how to create a pandas dataframe index. Additionally, we have also created multilevel indices and learnt how to remove the index from a pandas dataframe. To learn more about python programming, you can read this article on list comprehension in Python. If you are into machine learning, you can read this article on regular expressions in machine learning.

Stay tuned for more informative articles.

Happy Learning!
