How do I select a column in a list in Python?

Use DataFrame.loc[] and DataFrame.iloc[] to select a single column or multiple columns from a pandas DataFrame by column name/label or by index position, respectively: loc[] is used with column labels/names and iloc[] is used with column index/position. You can also use these indexers to select rows from a pandas DataFrame. Also, refer to the related article on how to get a cell value from a pandas DataFrame.

A pandas DataFrame is a two-dimensional tabular data structure with labeled axes, i.e. columns and rows. Selecting columns from a DataFrame results in a new DataFrame containing only the selected columns from the original DataFrame.

In this article, I will explain how to select single or multiple columns from a DataFrame by column label and index, by certain positions of the column, and by range, with examples.

1. Quick Examples of Select Columns from Pandas DataFrame

If you are in a hurry, below are some quick examples of how to select single or multiple columns from pandas DataFrame by column name and index.

# Below are quick examples

# By using df[] notation
df2 = df[["Courses","Fee","Duration"]]  # Select multiple columns

# Using loc[] to take column slices
df2 = df.loc[:, ["Courses","Fee","Duration"]]  # Select multiple columns
df2 = df.loc[:, ["Courses","Fee","Discount"]]  # Select random columns
df2 = df.loc[:,'Fee':'Discount']  # Select columns between two columns
df2 = df.loc[:,'Duration':]  # Select columns by range
df2 = df.loc[:,:'Duration']  # Select columns by range
df2 = df.loc[:,::2]  # Select every alternate column

# Using iloc[] to select columns by index
df2 = df.iloc[:,[1,3,4]]  # Select columns by index
df2 = df.iloc[:,1:4]  # Select positions 1 to 3 (2nd, 3rd and 4th columns)
df2 = df.iloc[:,2:]  # Select from 3rd to end
df2 = df.iloc[:,:2]  # Select first two columns

Now, let's create a DataFrame with a few rows and columns and execute some examples of how to select columns in pandas. Our DataFrame contains the column names Courses, Fee, Duration, Discount, and Tutor.

import pandas as pd
technologies = {
    'Courses':["Spark","PySpark"],
    'Fee' :[20000,25000],
    'Duration':['30days','40days'],
    'Discount':[1000,2300],
    'Tutor':['Michel','Sam']
}
df = pd.DataFrame(technologies)
print(df)

This yields the output below.

   Courses    Fee Duration  Discount   Tutor
0    Spark  20000   30days      1000  Michel
1  PySpark  25000   40days      2300     Sam

2. Using loc[] to Select Columns by Name

By using pandas.DataFrame.loc[] you can select columns by names or labels. To select columns by name, the syntax is df.loc[:,start:stop:step], where start is the name of the first column to take, stop is the name of the last column to take (label slices include the stop column), and step is the number of positions to advance after each extraction; for example, a step of 2 selects alternate columns. Alternatively, use the syntax df.loc[:,[labels]], with labels as a list of column names to take.

# loc[] syntax to slice columns
df.loc[:,start:stop:step]
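As a minimal sketch of the start:stop:step syntax, the block below rebuilds the sample data from the complete example in this article and slices by label with a step of 2. The variable name every_other is only illustrative.

```python
import pandas as pd

# Same sample data as the complete example in this article
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark"],
    'Fee': [20000, 25000],
    'Duration': ['30days', '40days'],
    'Discount': [1000, 2300],
    'Tutor': ['Michel', 'Sam'],
})

# start='Fee', stop='Discount', step=2.
# Label slices include the stop label, so 'Discount' is part of the result.
every_other = df.loc[:, 'Fee':'Discount':2]
print(list(every_other.columns))  # ['Fee', 'Discount']
```

With a step of 2, 'Duration' is skipped, while 'Discount' is kept because it is the stop label.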

2.1 Select DataFrame Columns by Name

To select single or multiple columns by labels or names, all you need is to provide the names of the columns as a list. Here we use the [] notation instead of the df.loc[:,start:stop:step] approach.

# Select columns by labels
df2 = df[["Courses","Fee","Duration"]]

# Returns
#   Courses    Fee Duration
#0    Spark  20000   30days
#1  PySpark  25000   40days

2.2 Select Multiple Columns

Sometimes you may want to select multiple columns from a pandas DataFrame. You can do this by passing multiple column names/labels as a list. Note that loc[] also supports multiple conditions when selecting rows based on column values.

# Select multiple columns
df2 = df.loc[:, ["Courses","Fee","Discount"]]

# Returns
#   Courses    Fee  Discount
#0    Spark  20000      1000
#1  PySpark  25000      2300
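The note above about multiple conditions can be sketched as follows. This is only an illustration on the article's sample data; the thresholds and the names rows and subset are assumptions chosen for the example.

```python
import pandas as pd

# Same sample data as the complete example in this article
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark"],
    'Fee': [20000, 25000],
    'Duration': ['30days', '40days'],
    'Discount': [1000, 2300],
    'Tutor': ['Michel', 'Sam'],
})

# Combine two boolean conditions with & (each condition needs parentheses)
rows = (df['Fee'] > 20000) & (df['Discount'] > 1000)

# First argument filters rows, second argument selects columns
subset = df.loc[rows, ['Courses', 'Fee']]
print(subset)
```

Only the PySpark row satisfies both conditions, so subset holds one row with the Courses and Fee columns.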

2.3 Select DataFrame Columns by Range

When you want to select columns by range, provide the start and stop column names.

  • By not providing a start column, loc[] selects from the beginning.
  • By not providing stop, loc[] selects all columns from the start label.
  • Providing both start and stop selects all columns from start through stop, inclusive.

# Select all columns between Fee and Discount columns
df2 = df.loc[:,'Fee':'Discount']

# Returns
#     Fee Duration  Discount
#0  20000   30days      1000
#1  25000   40days      2300

# Select from 'Duration' column
df2 = df.loc[:,'Duration':]

# Returns
#  Duration  Discount   Tutor
#0   30days      1000  Michel
#1   40days      2300     Sam

# Select from beginning and end at 'Duration' column
df2 = df.loc[:,:'Duration']

# Returns
#   Courses    Fee Duration
#0    Spark  20000   30days
#1  PySpark  25000   40days

2.4 Select Every Alternate Column

Using loc[], you can also select every other column from a pandas DataFrame.

# Select every alternate column
df2 = df.loc[:,::2]

# Returns
#   Courses Duration   Tutor
#0    Spark   30days  Michel
#1  PySpark   40days     Sam

3. Pandas iloc[] to Select Column by Index or Position

By using pandas.DataFrame.iloc[] you can select columns from a DataFrame by position/index. Remember that the index starts from 0. You can use iloc[] with the syntax [:,start:stop:step], where start indicates the index of the first column to take, stop indicates the index at which to stop (the column at stop itself is not included, unlike with loc[]), and step indicates the number of indices to advance after each extraction. Alternatively, use the syntax [:,[indices]], with indices as a list of column indices to take.
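The difference between how loc[] and iloc[] treat the stop value is easy to miss, so here is a small sketch on the article's sample data. The names by_label and by_position are illustrative only.

```python
import pandas as pd

# Same sample data as the complete example in this article
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark"],
    'Fee': [20000, 25000],
    'Duration': ['30days', '40days'],
    'Discount': [1000, 2300],
    'Tutor': ['Michel', 'Sam'],
})

# loc[]: label slice, the stop label 'Discount' (position 3) is included
by_label = df.loc[:, 'Fee':'Discount']
print(list(by_label.columns))     # ['Fee', 'Duration', 'Discount']

# iloc[]: position slice, the stop position 3 is excluded
by_position = df.iloc[:, 1:3]
print(list(by_position.columns))  # ['Fee', 'Duration']
```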

3.1. Select Multiple Columns by Index Position

The example below retrieves "Fee", "Discount" and "Tutor" and returns a new DataFrame with the selected columns.

# Select by column position
df2 = df.iloc[:,[1,3,4]]

# Returns
#     Fee  Discount   Tutor
#0  20000      1000  Michel
#1  25000      2300     Sam

3.2 Select Columns by Position Range

You can also slice a DataFrame by a range of positions.

# Select positions 1 to 3 (2nd, 3rd and 4th columns)
df2 = df.iloc[:,1:4]

# Returns
#     Fee Duration  Discount
#0  20000   30days      1000
#1  25000   40days      2300

# Select from 3rd to end
df2 = df.iloc[:,2:]

# Returns
#  Duration  Discount   Tutor
#0   30days      1000  Michel
#1   40days      2300     Sam

# Select first two columns
df2 = df.iloc[:,:2]

# Returns
#   Courses    Fee
#0    Spark  20000
#1  PySpark  25000

To get the last column, use df.iloc[:,-1:], and to get just the first column, use df.iloc[:,:1].
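As a short sketch of these first/last selections on the article's sample data, the block below also shows that a negative start works for more than one trailing column. The variable names are illustrative.

```python
import pandas as pd

# Same sample data as the complete example in this article
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark"],
    'Fee': [20000, 25000],
    'Duration': ['30days', '40days'],
    'Discount': [1000, 2300],
    'Tutor': ['Michel', 'Sam'],
})

last = df.iloc[:, -1:]      # DataFrame with only the last column
first = df.iloc[:, :1]      # DataFrame with only the first column
last_two = df.iloc[:, -2:]  # DataFrame with the last two columns

print(list(last.columns))      # ['Tutor']
print(list(first.columns))     # ['Courses']
print(list(last_two.columns))  # ['Discount', 'Tutor']
```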

4. Complete Example of pandas Select Columns

Below is a complete example of how to select columns from a pandas DataFrame.

import pandas as pd
technologies = {
    'Courses':["Spark","PySpark"],
    'Fee' :[20000,25000],
    'Duration':['30days','40days'],
    'Discount':[1000,2300],
    'Tutor':['Michel','Sam']
}
df = pd.DataFrame(technologies)
print(df)

# Select multiple columns
print(df[["Courses","Fee","Duration"]])

# Select random columns
print(df.loc[:, ["Courses","Fee","Discount"]])

# Select columns by range
print(df.loc[:,'Fee':'Discount'])
print(df.loc[:,'Duration':])
print(df.loc[:,:'Duration'])

# Select every alternate column
print(df.loc[:,::2])

# Select by column position
print(df.iloc[:,[1,3,4]])

# Select positions 1 to 3 (2nd, 3rd and 4th columns)
print(df.iloc[:,1:4])

# Select from 3rd to end
print(df.iloc[:,2:])

# Select first two columns
print(df.iloc[:,:2])

Conclusion

In this article, you have learned how to select single or multiple columns from a pandas DataFrame using the DataFrame.loc[] and DataFrame.iloc[] properties. To understand the similarities and differences between the two, refer to pandas loc vs iloc.

Happy Learning !!


References

  • //pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.loc.html

How do I select a column in a list in Python?

The most basic way to select a single column from a DataFrame is to put the string name of the column in brackets, for example df["Fee"]. This returns a pandas Series. Passing a list in the brackets, for example df[["Courses","Fee"]], lets you select multiple columns at the same time.
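A minimal sketch of the difference between the two bracket forms, using the article's sample data. The names one_column and as_frame are illustrative.

```python
import pandas as pd

# Same sample data as the complete example in this article
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark"],
    'Fee': [20000, 25000],
    'Duration': ['30days', '40days'],
    'Discount': [1000, 2300],
    'Tutor': ['Michel', 'Sam'],
})

# A single string name returns a Series
one_column = df['Fee']
print(type(one_column).__name__)  # Series

# A list of names returns a DataFrame, even if the list has one entry
as_frame = df[['Fee']]
print(type(as_frame).__name__)    # DataFrame
```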

How do I extract a column in Python?

Extracting multiple rows and columns from a DataFrame.
Syntax: variable_name = dataframe_name.iloc[rows, columns]
Example 1: a = df.iloc[[0,1], [0,1]]
Explanation: to extract multiple rows and columns, pass a list of row positions and a list of column positions to iloc[]; to use row labels and column names instead, pass them to loc[]. ...
Example 2: b = df.loc[[0,1], ["id","name"]]

How do I extract only certain columns in Python?

Extract columns from a DataFrame by position in Python:
import pandas as pd
input_file = "C:\\....\\consumer_complaints.csv"
dataset = pd.read_csv(input_file)
df = pd.DataFrame(dataset)
cols = [1,2,3,4]
df = df[df.columns[cols]]
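Because the file path in the snippet above is elided, here is a self-contained sketch of the same idea that reads CSV text from memory instead. The CSV contents and the column names id, name, city and score are made up for illustration and are not taken from consumer_complaints.csv.

```python
import io
import pandas as pd

# Hypothetical CSV data standing in for a real file on disk
csv_text = "id,name,city,score\n1,Ann,Oslo,90\n2,Bob,Rome,85\n"
dataset = pd.read_csv(io.StringIO(csv_text))

# Positions of the columns to keep (0-based)
cols = [1, 3]

# Look up the labels at those positions, then select by label
picked = dataset[dataset.columns[cols]]

# Equivalent selection directly by position
same = dataset.iloc[:, cols]

print(list(picked.columns))  # ['name', 'score']
```

Both forms give the same result here; iloc[] skips the intermediate label lookup.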

How do I select a column in pandas Python?

There are three basic methods you can use to select multiple columns of a pandas DataFrame:
Method 1: Select columns by index: df_new = df.iloc[:, [0,1,3]]
Method 2: Select columns in an index range: df_new = df.iloc[:, 0:3]
Method 3: Select columns by name: df_new = df[['col1', 'col2']]
