How do you delete a DataFrame in Python?

Use the pandas.DataFrame.drop() method to drop (remove/delete) rows from a DataFrame. The axis param specifies which axis to remove labels from. The default, axis=0, removes rows; use axis=1 or the columns param to remove columns. By default, pandas returns a copy of the DataFrame with the rows removed; use inplace=True to remove them from the existing DataFrame instead.

Related: Drop DataFrame Rows by Checking Conditions

In this article, I will cover how to remove rows by labels, by index positions, and by ranges, how to drop rows in place, and how to drop rows with None, NaN, and null values, with examples. If you have duplicate rows, use drop_duplicates() to drop them from a pandas DataFrame.
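
As a quick illustration of the drop_duplicates() call mentioned above, here is a minimal sketch. The sample data is hypothetical (it is not the DataFrame used in the rest of this article); the second and third rows are identical on purpose.

```python
import pandas as pd

# Hypothetical sample data: rows 1 and 2 are exact duplicates.
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "PySpark"],
    "Fee": [20000, 25000, 25000],
})

# drop_duplicates() returns a new DataFrame; by default it keeps
# the first occurrence of each duplicated row.
deduped = df.drop_duplicates()
print(deduped)
```

The original df is left unchanged, just like with drop().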

1. Pandas.DataFrame.drop() Syntax – Drop Rows & Columns

# pandas DataFrame drop() syntax
DataFrame.drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')

labels – Single label or list-like. Used together with the axis param.
axis – Defaults to 0. Use 0 to drop rows and 1 to drop columns.
index – Row labels to drop. Accepts a single label or list-like.
columns – Column labels to drop. Accepts a single label or list-like.
level – int or level name, optional. Used with a MultiIndex to pick the level the labels are removed from.
inplace – Defaults to False, which returns a copy of the DataFrame. When set to True, it drops the rows or columns in place (on the current DataFrame) and returns None.
errors – {'ignore', 'raise'}, default 'raise'. With 'ignore', labels that don't exist are skipped instead of raising an error.
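
To make the errors param more concrete, here is a small sketch. The DataFrame and the label 'r9' are made up for illustration; 'r9' intentionally does not exist in the index.

```python
import pandas as pd

# Hypothetical two-row DataFrame with custom row labels.
df = pd.DataFrame(
    {"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]},
    index=["r1", "r2"],
)

# With the default errors='raise', a missing label raises KeyError.
try:
    df.drop(index=["r1", "r9"])
    raised = False
except KeyError:
    raised = True

# With errors='ignore', existing labels are dropped and missing ones are skipped.
remaining = df.drop(index=["r1", "r9"], errors="ignore")

print(raised)
print(remaining)
```

Using errors='ignore' is handy when the list of labels comes from elsewhere and may contain entries that were already removed.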

Let’s create a DataFrame, run some examples and explore the output. Note that our DataFrame contains index labels for rows which I am going to use to demonstrate removing rows by labels.

import pandas as pd
import numpy as np
technologies = {
    'Courses': ["Spark", "PySpark", "Hadoop", "Python"],
    'Fee': [20000, 25000, 26000, 22000],
    'Duration': ['30day', '40days', np.nan, None],
    'Discount': [1000, 2300, 1500, 1200]
}
indexes = ['r1', 'r2', 'r3', 'r4']
df = pd.DataFrame(technologies, index=indexes)
print(df)

Yields the below output.

    Courses    Fee Duration  Discount
r1    Spark  20000    30day      1000
r2  PySpark  25000   40days      2300
r3   Hadoop  26000      NaN      1500
r4   Python  22000     None      1200

By default drop() method removes rows (axis=0) from DataFrame. Let’s see several examples of how to remove rows from DataFrame.

2.1 Drop rows by Index Labels or Names

One of the advantages of pandas is that you can assign labels (names) to rows, similar to column names. If your DataFrame has row labels (index labels), you can specify the rows you want to remove by label.

# Drop rows by index label
df = pd.DataFrame(technologies, index=indexes)
df1 = df.drop(['r1', 'r2'])
print(df1)

Yields the below output.

    Courses    Fee Duration  Discount
r3   Hadoop  26000      NaN      1500
r4   Python  22000     None      1200

Alternatively, you can write the same statement using the index param.

# Delete rows by index labels
df1 = df.drop(index=['r1', 'r2'])

And by using labels and axis as below.

# Delete rows by index labels & axis
df1 = df.drop(labels=['r1', 'r2'])
df1 = df.drop(labels=['r1', 'r2'], axis=0)

Notes:

As you can see, using labels with axis=0 is equivalent to using index with the same label names.
axis=0 means rows. By default, the drop() method uses axis=0, so you don't have to specify it to remove rows. To remove columns, explicitly specify axis=1 or use the columns param.
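
The column case mentioned in the notes can be sketched the same way. This is a minimal example on a trimmed-down, hypothetical version of the sample data; it shows that labels with axis=1 and the columns param give the same result.

```python
import pandas as pd

# Hypothetical trimmed-down version of the sample data.
df = pd.DataFrame(
    {"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000], "Discount": [1000, 2300]},
    index=["r1", "r2"],
)

# Two equivalent ways to drop the 'Fee' column.
by_axis = df.drop(labels=["Fee"], axis=1)
by_name = df.drop(columns=["Fee"])

print(by_axis.equals(by_name))
print(list(by_name.columns))
```

Both calls return a new DataFrame and leave df untouched.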

2.2 Drop Rows by Index Number (Row Number)

Similarly, you can use the drop() method to remove rows by index position. drop() doesn't take a positional index as a param, so we need to get the row labels from the index and pass them to drop(). We will use df.index to get the row labels for the positions we want to delete.

df.index.values returns all row labels as a NumPy array.
df.index[[1,3]] gets you the row labels for the 2nd and 4th rows; passing these to drop() removes those rows. Note that positions in Python start from zero.

# Delete rows by index position
df = pd.DataFrame(technologies, index=indexes)
df1 = df.drop(df.index[[1, 3]])
print(df1)

This removes rows r2 and r4, leaving r1 and r3. To remove the first row, you can use df.drop(df.index[0]), and to remove the last row, use df.drop(df.index[-1]).

# Removes the first row
df = df.drop(df.index[0])
# Removes the last row
df = df.drop(df.index[-1])

2.3 Delete Rows by Index Range

You can also remove rows by specifying an index range. The example below removes all rows starting from the 3rd row.

# Delete rows by index range
df = pd.DataFrame(technologies, index=indexes)
df1 = df.drop(df.index[2:])
print(df1)

Yields the below output.

    Courses    Fee Duration  Discount
r1    Spark  20000    30day      1000
r2  PySpark  25000   40days      2300
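
A range can also be expressed with labels instead of positions. The sketch below is one possible way to do it, not the only one: it uses a loc[] label slice to collect the labels and then passes them to drop(). Keep in mind that label slices include both endpoints.

```python
import pandas as pd

# Reduced version of the sample data, with the same row labels.
df = pd.DataFrame(
    {"Courses": ["Spark", "PySpark", "Hadoop", "Python"]},
    index=["r1", "r2", "r3", "r4"],
)

# Label slices with loc[] include both endpoints, so this selects r2 and r3.
labels_to_drop = df.loc["r2":"r3"].index
df1 = df.drop(labels_to_drop)

print(list(df1.index))
```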

2.4 Delete Rows when you have Default Indexes

By default, pandas assigns a sequence number to every row, also called the index. The row index starts from zero and increments by 1 for every row. If you are not using custom index labels, pandas uses these sequence numbers as the index. To remove rows with the default index, you can try the below.

# Remove rows when you have the default index
df = pd.DataFrame(technologies)
df1 = df.drop(0)            # drops the row labeled 0
df3 = df.drop([0, 3])       # drops the rows labeled 0 and 3
df4 = df.drop(range(0, 2))  # drops the rows labeled 0 and 1

Note that df.drop(-1) doesn't remove the last row, because the label -1 is not present in the index; it raises a KeyError instead. You can still use df.drop(df.index[-1]) to remove the last row.
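
The difference between a label and a position is easy to check. Here is a minimal sketch on a hypothetical three-row DataFrame with the default index.

```python
import pandas as pd

# Hypothetical DataFrame with the default index (labels 0, 1, 2).
df = pd.DataFrame({"Courses": ["Spark", "PySpark", "Hadoop"]})

# -1 is treated as a label, and no row carries that label.
try:
    df.drop(-1)
    key_error = False
except KeyError:
    key_error = True

# df.index[-1] is a positional lookup that returns the last label (2 here).
without_last = df.drop(df.index[-1])

print(key_error)
print(list(without_last.index))
```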

2.5 Remove DataFrame Rows inplace

All the examples above return a copy of the DataFrame after removing rows. If you want to remove rows in place from the DataFrame you are referring to, use inplace=True. By default, the inplace param is set to False.

# Delete rows in place
df = pd.DataFrame(technologies, index=indexes)
df.drop(['r1', 'r2'], inplace=True)
print(df)
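
To see what "in place" means in practice, here is a small sketch. The variable alias is a hypothetical second reference to the same DataFrame, added only to show that the change is visible through every reference, while a plain drop() leaves the original alone.

```python
import pandas as pd

df = pd.DataFrame(
    {"Courses": ["Spark", "PySpark", "Hadoop"]},
    index=["r1", "r2", "r3"],
)
alias = df  # hypothetical second name for the same object

# In-place drop: the object itself changes, so alias sees it too.
df.drop(["r1"], inplace=True)
print(list(alias.index))

# Default drop: a new DataFrame is returned and df stays as it was.
copy_result = df.drop(["r2"])
print(list(df.index))
print(list(copy_result.index))
```

Because the in-place call modifies df directly, avoid writing df = df.drop(..., inplace=True).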

2.6 Drop Rows by Checking Conditions

Most of the time, we also need to remove DataFrame rows based on some condition (a column value). You can do this by selecting the rows you want to keep with a boolean condition inside loc[]. The example below keeps the rows where Discount is at least 1500, which drops all the others.

# Delete rows by checking conditions
df = pd.DataFrame(technologies)
df1 = df.loc[df["Discount"] >= 1500]
print(df1)

Yields the below output.

   Courses    Fee Duration  Discount
1  PySpark  25000   40days      2300
2   Hadoop  26000      NaN      1500
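
The same result can also be written with drop() itself, by collecting the labels of the rows that fail the condition. This is a sketch on a reduced copy of the sample data, and it assumes the index labels are unique; with duplicate labels, drop() would remove every row that shares a label.

```python
import pandas as pd

# Reduced copy of the sample data with the default index.
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Python"],
    "Discount": [1000, 2300, 1500, 1200],
})

# Keep the rows that satisfy the condition.
kept = df.loc[df["Discount"] >= 1500]

# Equivalent here: drop the labels of the rows that fail the condition.
dropped = df.drop(df[df["Discount"] < 1500].index)

print(kept.equals(dropped))
```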

2.7 Drop Rows that have NaN/None/Null Values

When working with analytics, you often need to clean up data that has None, null, and np.nan values. Use df.dropna() to remove rows with missing values from a DataFrame.

# Delete rows with NaN, None & null values
df = pd.DataFrame(technologies, index=indexes)
df2 = df.dropna()
print(df2)

This removes all rows that have None, null, or NaN values in any column.

    Courses    Fee Duration  Discount
r1    Spark  20000    30day      1000
r2  PySpark  25000   40days      2300
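
dropna() also accepts params that narrow down which rows count as missing. The sketch below shows subset and how on hypothetical data (row r4 is empty in both columns, which differs from the sample DataFrame above).

```python
import pandas as pd
import numpy as np

# Hypothetical data: r3 is missing Duration only, r4 is missing everything.
df = pd.DataFrame(
    {
        "Courses": ["Spark", "PySpark", "Hadoop", None],
        "Duration": ["30day", "40days", np.nan, None],
    },
    index=["r1", "r2", "r3", "r4"],
)

# Only look at the Duration column when deciding what to drop.
only_duration = df.dropna(subset=["Duration"])

# Drop a row only when every column in it is missing.
all_missing = df.dropna(how="all")

print(list(only_duration.index))
print(list(all_missing.index))
```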

2.8 Remove Rows by Slicing DataFrame

You can also remove DataFrame rows by slicing. Remember that positions start from zero and the end of a slice is exclusive.

df2 = df[4:]    # Returns rows from position 4 onward (drops the first four rows)
df2 = df[1:-1]  # Removes the first and last rows
df2 = df[2:4]   # Returns the rows at positions 2 and 3

You can also remove the first N rows and the last N rows from a pandas DataFrame.
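
One way to remove the first or last N rows is positional slicing with iloc[]. This is a minimal sketch; n is a hypothetical variable, and the last-N case uses len(df) - n rather than -n so that n = 0 keeps every row.

```python
import pandas as pd

df = pd.DataFrame(
    {"Courses": ["Spark", "PySpark", "Hadoop", "Python"]},
    index=["r1", "r2", "r3", "r4"],
)
n = 1  # hypothetical number of rows to remove

# Remove the first n rows.
without_first_n = df.iloc[n:]

# Remove the last n rows.
without_last_n = df.iloc[: len(df) - n]

print(list(without_first_n.index))
print(list(without_last_n.index))
```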

Happy Learning !!

Conclusion

In this pandas drop rows article, you have learned how to drop/remove pandas DataFrame rows using the drop() method. By default, drop() deletes rows (axis=0); if you want to delete columns, use either axis=1 or the columns=labels param.


How do you delete a data frame in Python?

The pandas DataFrame drop() method removes the specified row or column. With the column axis (axis='columns'), drop() removes the specified column. With the row axis (axis='index'), drop() removes the specified row.

How do I delete DataFrame data in pandas?

To delete a row from a DataFrame, use the drop() method and pass the row's index label as the parameter.


How do you delete all data from a DataFrame in Python?

Delete rows and columns from a DataFrame using pandas drop(). Common cases:

Delete a single row
Delete multiple rows
Delete rows based on row position and custom range
Delete a single column
Delete multiple columns
Delete columns based on column position and custom range
Working with a MultiIndex DataFrame

