How do you count variables in python?

View Discussion

Improve Article

Save Article

  • Read
  • Discuss
  • View Discussion

    Improve Article

    Save Article

    Python List count() method returns the count of how many times a given object occurs in a List.

    Python List count() method Syntax

    Syntax:  list_name.count(object) 

    Parameters: 

    • object: is the item whose count is to be returned.

    Returns:  Returns the count of how many times object occurs in the list.

    Exception: 

    • TypeError: Raises TypeError If more than 1 parameter is passed in count() method.

    Python List count() method Example

    Python3

    list2 = ['a', 'a', 'a', 'b', 'b', 'a', 'c', 'b']

    print(list2.count('b'))

    Output:

    3

    Example 1: Count Tuple and List Elements Inside List

    Count occurrences of List and Tuples inside a list using count() method.

    Python3

    list1 = [ ('Cat', 'Bat'), ('Sat', 'Cat'), ('Cat', 'Bat'),

              ('Cat', 'Bat', 'Sat'), [1, 2], [1, 2, 3], [1, 2] ]

    print(list1.count(('Cat', 'Bat')))

    print(list1.count([1, 2]))

    Output: 

    2
    2

    Exceptions while using Python list count() method

    TypeError

    Python list count() method raises TypeError when more than 1 parameter is passed.

    Python3

    list1 = [1, 1, 1, 2, 3, 2, 1]

    print(list1.count(1, 2))

    Output:

    Traceback (most recent call last):
      File "/home/41d2d7646b4b549b399b0dfe29e38c53.py", line 7, in 
        print(list1.count(1, 2))  
    TypeError: count() takes exactly one argument (2 given)

    Practical Application

     Let’s say we want to count each element in a list and store it in another list or say dictionary.

    Python3

    lst = ['Cat', 'Bat', 'Sat', 'Cat', 'Mat', 'Cat', 'Sat']

    print ([ [l, lst.count(l)] for l in set(lst)])

    print (dict( (l, lst.count(l) ) for l in set(lst)))

    Output:

    [['Mat', 1], ['Cat', 3], ['Sat', 2], ['Bat', 1]]
    {'Bat': 1, 'Cat': 3, 'Sat': 2, 'Mat': 1}

    You are here: Home / Pandas 101 / Pandas value_counts: How To Get Counts of Unique Variables in a Dataframe?

    As part of exploring a new data, often you might want to count unique values of one or more columns in a dataframe. Pandas value_counts() can get counts of unique values of columns in a Pandas dataframe. Starting from Pandas version 1.1.0, we can use value_counts() on a Pandas Series and dataframe as well.

    In this tutorial, we will see examples of using Pandas value_counts on a single variable in a dataframe (i.e. Series value_counts()) first and then see how to use value_counts on a dataframe, i.e. with multiple variables.

    Let us load Pandas and check the version of Pandas. With Pandas version 1.1.0 and above we can use value_counts() on Pandas dataframe and Series.

    import pandas as pd
    import numpy as np
    pd.__version__
    1.1.0
    

    We will be using Penguins data set to count values for one or more variables. Let us load the data directly from github page.

    data_url = "https://raw.githubusercontent.com/cmdlinetips/data/master/palmer_penguins.csv"
    df = pd.read_csv(data_url)
    df.head()
    species	island	bill_length_mm	bill_depth_mm	flipper_length_mm	body_mass_g	sex
    0	Adelie	Torgersen	39.1	18.7	181.0	3750.0	male
    1	Adelie	Torgersen	39.5	17.4	186.0	3800.0	female
    2	Adelie	Torgersen	40.3	18.0	195.0	3250.0	female
    3	Adelie	Torgersen	NaN	NaN	NaN	NaN	NaN
    4	Adelie	Torgersen	36.7	19.3	193.0	3450.0	female
    

    For our examples illustrating value_counts(), we will mainly focus first on categorical variables and get their counts/frequency.

    Pandas value_counts() on a single column

    With Pandas value_counts() function we can compute the frequency of a variable from dataframe as shown below. In the example below, we are interested in “island” column and ask what are the counts of each unique island in the dataset. We can see that there are three different islands in the data and also count/frequency of each of them.

    df.island.value_counts()
    Biscoe       168
    Dream        124
    Torgersen     52
    Name: island, dtype: int64
    

    By default, the resulting Series object from Pandas value_counts()

    will be in descending order so that the first element is the most frequently-occurring element. Excludes NA values by default.

    Here is another example of using value_counts() on another categorical variable and their counts.

    df.species.value_counts()
    
    Adelie       152
    Gentoo       124
    Chinstrap     68
    Name: species, dtype: int64
    

    Pandas’ value_counts() to get proportion

    By using normalize=True argument to Pandas value_counts() function, we can get the proportion of each value of the variable instead of the counts.

    df.species.value_counts(normalize = True)
    

    We can see that the resulting Series has relative frequencies of the unique values.

    Adelie       0.441860
    Gentoo       0.360465
    Chinstrap    0.197674
    Name: species, dtype: float64
    

    How to convert value_counts() output into a data frame

    Pandas’ value_counts() returns a Series object. Here is a way to convert the output Series from value_counts() into a Pandas Data Frame. We can use reset_index() function and a get data frame.

    df.species.value_counts().reset_index()
    
    index	species
    0	Adelie	152
    1	Gentoo	124
    2	Chinstrap	68
    

    Pandas reset_index() function uses the variable name for the counts. Often you would like to rename column names to more suitable names. Here we assign new column names using columns method on the dataframe.

    df_species_counts = df.species.value_counts().reset_index()
    df_species_counts.columns = ["island", "n"]
    

    Now the results from value_counts() is a dataframe with suitable column names.

    df_species_counts
    
    island	n
    0	Biscoe	168
    1	Dream	124
    2	Torgersen	52
    

    Pandas value_counts() on multiple columns (or on a dataframe)

    Sometimes you might want to tabulate counts of multiple variables. With Pandas version 1.1.0 and above we can use Pandas’ value_coiunts() function to get counts for multiple variable. For example, if we want to know the counts of each island and species combination, we can use value_counts as follows.

    df[["island", "species"]].value_counts()
    

    Pandas value_counts() on dataframe gives the result as multi-index Series.

    island     species  
    Biscoe     Gentoo       124
    Dream      Chinstrap     68
               Adelie        56
    Torgersen  Adelie        52
    Biscoe     Adelie        44
    dtype: int64
    

    We can also use Pandas to_markdown() fucntion to nicely tabulate the results from value_counts()

    print(df[["island", "species"]].value_counts().to_markdown())
    
    |                         |   0 |
    |:------------------------|----:|
    | ('Biscoe', 'Gentoo')    | 124 |
    | ('Dream', 'Chinstrap')  |  68 |
    | ('Dream', 'Adelie')     |  56 |
    | ('Torgersen', 'Adelie') |  52 |
    | ('Biscoe', 'Adelie')    |  44 |
    

    If you want the results from value_counts() as a simple data frame, not multi-indexed one, we can use reset_index() function to get a simple dataframe.

    df[["island", "species"]].value_counts(sort=False).reset_index()
    island	species	0
    0	Biscoe	Adelie	44
    1	Biscoe	Gentoo	124
    2	Dream	Adelie	56
    3	Dream	Chinstrap	68
    4	Torgersen	Adelie	52
    

    Another useful feature of value_counts() is that we can get normalized values of counts as proportions, the relative frequency by dividing each value by the sum of all values., using normalize=True argument.

    df[["island", "species"]].value_counts(normalize=True)
    island     species  
    Biscoe     Gentoo       0.360465
    Dream      Chinstrap    0.197674
               Adelie       0.162791
    Torgersen  Adelie       0.151163
    Biscoe     Adelie       0.127907
    dtype: float64
    

    If we want the result as a simple dataframe, we can use reset_index() function to convert the multi-indexed dataframe to a simple one.

    df[["island", "species"]].value_counts(normalize=True).reset_index()
    island	species	0
    0	Biscoe	Gentoo	0.360465
    1	Dream	Chinstrap	0.197674
    2	Dream	Adelie	0.162791
    3	Torgersen	Adelie	0.151163
    4	Biscoe	Adelie	0.127907
    

    Pandas value_counts() with groupby()

    If you are using pandas version below 1.1.0 and stil want to compute counts of multiple variables, the solution is to use Pandas groupby function. We basically select the variables of interest from the data frame and use groupby on the variables and compute size. In the example below we have computed the counts for each combination of unique values of two variables.

    df[["island", "species"]].groupby(["island", "species"]).size()
    island     species  
    Biscoe     Adelie        44
               Gentoo       124
    Dream      Adelie        56
               Chinstrap     68
    Torgersen  Adelie        52
    dtype: int64
    

    You can see that we get the same results, but with different order. As both value_counts() and groupby() functions have different sort options.

    What does count () do in Python?

    The count() method returns the number of elements with the specified value.

    How do you count inputs in Python?

    Python String has got an in-built function – string. count() method to count the occurrence of a character or a substring in the particular input string. The string. count() method accepts a character or a substring as an argument and returns the number of times the input substring happens to appear in the string.

    How do you count a variable in a string in Python?

    Python count The count() is a built-in function in Python. It will return the total count of a given element in a string. The counting begins from the start of the string till the end. It is also possible to specify the start and end index from where you want the search to begin.

    How do you write a count function in Python?

    Syntax of count() Function in Python Python Count() function has following syntax: string. count(substring/character, start=, end=)