Read a column from Excel in Python

How do I read the contents of a specific column with a given name, given that FORMAT (the set of column names to read) is configurable?

This is what I tried. Currently I'm able to read all the content in the file:

from xlrd import open_workbook

wb = open_workbook('sample.xls')
for s in wb.sheets():
    # print('Sheet:', s.name)
    values = []
    for row in range(s.nrows):
        col_value = []
        for col in range(s.ncols):
            value = s.cell(row, col).value
            # Values that int() accepts become integer strings; anything else is left unchanged.
            try:
                value = str(int(value))
            except ValueError:
                pass
            col_value.append(value)
        values.append(col_value)
print(values)

My output is :

[
    [u'Arm_id', u'DSPName', u'DSPCode', u'HubCode', u'PinCode', u'PPTL'],
    ['1', u'JaVAS', '1', u'AGR', '282001', u'1,2'], 
    ['2', u'JaVAS', '1', u'AGR', '282002', u'3,4'], 
    ['3', u'JaVAS', '1', u'AGR', '282003', u'5,6']
]

Then I loop over values[0] (the header row) to find where each name in FORMAT appears, for example the indices of Arm_id, DSPName and PinCode. In the next loop I use those indices to pick out the values I need from each row.

But this is such a poor solution.

How do I get the values of a specific column by name from an Excel file?
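The header lookup described in the question can be written more compactly by building a name-to-index mapping once. The following is only a sketch under assumptions: the values list is copied from the output above, while FORMAT and the helper columns_by_name are hypothetical names introduced for illustration.

```python
# Rows as produced by the xlrd loop in the question: header first, then data rows.
values = [
    ['Arm_id', 'DSPName', 'DSPCode', 'HubCode', 'PinCode', 'PPTL'],
    ['1', 'JaVAS', '1', 'AGR', '282001', '1,2'],
    ['2', 'JaVAS', '1', 'AGR', '282002', '3,4'],
    ['3', 'JaVAS', '1', 'AGR', '282003', '5,6'],
]

# Hypothetical configurable list of the column names to extract.
FORMAT = ['Arm_id', 'DSPName', 'PinCode']


def columns_by_name(rows, names):
    """Return {column name: list of values} for the requested names."""
    header, data = rows[0], rows[1:]
    # Map each header name to its position once, instead of searching per row.
    index = {name: i for i, name in enumerate(header)}
    return {name: [row[index[name]] for row in data] for name in names}


selected = columns_by_name(values, FORMAT)
print(selected['PinCode'])
```

A name in FORMAT that is missing from the header raises KeyError here, which may be preferable to silently skipping it.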

You can import an Excel file into Python using pandas. To do so, you’ll need the read_excel function.

In this short guide, you’ll see the steps to import an Excel file into Python using a simple example.

But before we start, here is a template that you may use in Python to import your Excel file:

import pandas as pd

df = pd.read_excel(r'Path where the Excel file is stored\File name.xlsx')
print(df)

Note that for files saved by an earlier version of Excel, you may need to use the file extension ‘.xls’ instead.

And if you have a specific Excel sheet that you’d like to import, you may then apply:

import pandas as pd

df = pd.read_excel(r'Path where the Excel file is stored\File name.xlsx', sheet_name='your Excel sheet name')
print(df)

Let’s now review an example that includes the data to be imported into Python.

The Data to be Imported into Python

Suppose that you have the following table stored in Excel (where the Excel file name is ‘Product List’):

Product           Price
Desktop Computer  700
Tablet            250
Printer           120
Laptop            1200

How would you then import the above data into Python?

You may follow the steps below to import an Excel file into Python.

Step 1: Capture the file path

First, you’ll need to capture the full path where the Excel file is stored on your computer.

For example, let’s suppose that an Excel file is stored under the following path:

C:\Users\Ron\Desktop\Product List.xlsx

In the Python code, to be provided below, you’ll need to modify the path name to reflect the location where the Excel file is stored on your computer.

Don’t forget to include the file name (in our example, it’s ‘Product List’). You’ll also need to include the Excel file extension (in our case, it’s ‘.xlsx’).

Step 2: Apply the Python code

And here is the Python code tailored to our example. Additional notes are included within the code to clarify some of the components used.

import pandas as pd

# Place "r" before the path string so that special characters such as '\' are read literally.
# Don't forget to put the file name and the '.xlsx' extension at the end of the path.
df = pd.read_excel(r'C:\Users\Ron\Desktop\Product List.xlsx')
print(df)

Step 3: Run the Python code to import the Excel file

Run the Python code (adjusted to your path), and you’ll get the following dataset:

            Product  Price
0  Desktop Computer    700
1            Tablet    250
2           Printer    120
3            Laptop   1200

Notice that we got the same results as those that were stored in the Excel file.

Note: you will have to install an additional package if you get an error like the following when running the code:

ImportError: Missing optional dependency 'xlrd'

The error message names the package that is missing. pandas reads .xlsx files with openpyxl and older .xls files with xlrd, so use pip to install whichever one matches your file:

pip install openpyxl
pip install xlrd

Optional Step: Selecting a subset of columns

Now what if you want to select a specific column or columns from the Excel file?

For example, what if you want to select only the Product column? If that’s the case, you can specify this column name as captured below:

import pandas as pd

data = pd.read_excel(r'C:\Users\Ron\Desktop\Product List.xlsx')
df = pd.DataFrame(data, columns=['Product'])
print(df)

Run the code (after adjusting the file path), and you’ll get only the Product column:

            Product
0  Desktop Computer
1            Tablet
2           Printer
3            Laptop

You can specify additional columns by separating their names using a comma, so if you want to include both the Product and Price columns, you can use this syntax:

import pandas as pd

data = pd.read_excel(r'C:\Users\Ron\Desktop\Product List.xlsx')
df = pd.DataFrame(data, columns=['Product', 'Price'])
print(df)

You’ll need to make sure that the column names specified in the code exactly match the column names in the Excel file, including capitalization and spaces. Otherwise, any name that does not match produces a column filled with NaN values.
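The effect of a mismatched name can be seen in a small sketch. It builds the Product List table in memory as a stand-in for the read_excel result, and ‘Cost’ is a deliberately wrong, hypothetical column name. It also shows bracket selection, an alternative not used in this guide, which raises an error on a wrong name instead of returning NaN.

```python
import pandas as pd

# In-memory stand-in for the DataFrame that read_excel returns for 'Product List'.
data = pd.DataFrame({
    'Product': ['Desktop Computer', 'Tablet', 'Printer', 'Laptop'],
    'Price': [700, 250, 120, 1200],
})

# 'Cost' does not exist in the data (hypothetical typo for 'Price').
# The constructor form used in this guide silently creates a column of NaN.
silent = pd.DataFrame(data, columns=['Product', 'Cost'])
all_missing = bool(silent['Cost'].isna().all())

# Bracket selection with a list of names raises KeyError for an unknown name.
try:
    data[['Product', 'Cost']]
    raised = False
except KeyError:
    raised = True

# With the correct name, a single column can be turned into a plain list.
prices = data['Price'].tolist()
print(all_missing, raised, prices)
```

Which behaviour is preferable depends on the use case; failing loudly tends to make typos in column names easier to spot.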

Conclusion

You just saw how to import an Excel file into Python using Pandas.

At times, you may need to import a CSV file into Python. If that’s the case, pandas provides the read_csv function for that purpose.

You may also check the pandas documentation to learn more about the options available for read_excel.

How do you read an Excel column in Python?

We can use the read_excel() function of the pandas module to read Excel file data into a DataFrame object. An Excel sheet is a two-dimensional table, and the DataFrame object likewise represents a two-dimensional tabular data structure, so each sheet column becomes a DataFrame column that can be accessed by its header name.
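Applied to the sheet from the question at the top, column access by name might look like the sketch below. The DataFrame is built in memory as a stand-in for what read_excel would return (only some of the sheet's columns are reproduced), and FORMAT is a hypothetical configurable list of column names.

```python
import pandas as pd

# Stand-in for the result of pd.read_excel('sample.xls'); values mirror the question's sheet.
df = pd.DataFrame({
    'Arm_id': [1, 2, 3],
    'DSPName': ['JaVAS', 'JaVAS', 'JaVAS'],
    'PinCode': [282001, 282002, 282003],
    'PPTL': ['1,2', '3,4', '5,6'],
})

# Hypothetical configurable list of the columns to read.
FORMAT = ['Arm_id', 'DSPName', 'PinCode']

# One column, by name, as a plain Python list.
pin_codes = df['PinCode'].tolist()

# Several columns at once, in the order given by FORMAT.
subset = df[FORMAT]

print(pin_codes)
print(list(subset.columns))
```

No index bookkeeping is needed, because the header row of the sheet becomes the column labels of the DataFrame.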

How do you extract a specific column from Excel in Python?

To read only certain columns with pandas, pass usecols to read_excel:

import pandas as pd

file_loc = "path.xlsx"
# usecols accepts Excel column letters and ranges, here column A plus columns C through AA.
df = pd.read_excel(file_loc, index_col=None, na_values=['NA'], usecols="A,C:AA")
print(df)

How do you read data from an Excel cell in Python?

# Import the xlrd module.
import xlrd

# Define the location of the file.
loc = "path of file"

# Open the workbook and take its first sheet.
wb = xlrd.open_workbook(loc)
sheet = wb.sheet_by_index(0)

# Read the cell at row 0 and column 0.
print(sheet.cell_value(0, 0))

How do I read two columns in Excel in Python?

If the value in column X is abc, then a file named abc.txt should be created in some path (if it does not already exist) and the value of column Z inserted into it. Likewise, if column X is xyz, then a file named xyz.txt should be created in the same path and the value of column Z inserted into xyz.txt.
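A minimal sketch of that behaviour, assuming the two columns have already been read into a list of rows. The sample rows, the helper write_grouped and the temporary output directory are all hypothetical stand-ins, not part of the original description.

```python
import os
import tempfile

# Hypothetical rows, standing in for columns X and Z already read from the sheet.
rows = [
    {'X': 'abc', 'Z': 'first'},
    {'X': 'xyz', 'Z': 'second'},
    {'X': 'abc', 'Z': 'third'},
]


def write_grouped(rows, out_dir):
    """Append each row's Z value to <out_dir>/<X>.txt, creating files as needed."""
    for row in rows:
        path = os.path.join(out_dir, '{}.txt'.format(row['X']))
        # Mode 'a' creates the file if it does not exist and appends otherwise.
        with open(path, 'a') as handle:
            handle.write('{}\n'.format(row['Z']))


# A temporary directory is used here so the sketch does not touch real paths.
out_dir = tempfile.mkdtemp()
write_grouped(rows, out_dir)

with open(os.path.join(out_dir, 'abc.txt')) as handle:
    abc_lines = handle.read().splitlines()
print(abc_lines)
```

Opening in append mode covers both cases in the description: the file is created on first use and extended on every later row with the same X value.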