If you work in analytics or data science, you will often need to create new columns in your data to understand it better. Doing that by hand in a tool like Excel is usually too time-consuming.
So in this article, I’m going to show you how to create columns in Python with functions.
In this example, I’m using a sample data set I created. There are columns for customers, revenues, and growth %.
import pandas as pd
CustomerSampleData = pd.read_excel(r'C:\Users\timen\Documents\Data Sets\Customer Sample Data.xlsx')
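If you don't have the spreadsheet at hand, a small stand-in DataFrame with the same three columns is enough to follow along. The column names and customer names come from this article, but the revenue and growth figures below are invented for illustration only.

```python
import pandas as pd

# Stand-in for the Excel file: all numbers here are made up for illustration.
CustomerSampleData = pd.DataFrame({
    'Customers': ['Duke Trading Associates', 'Luxury Restaurant Group',
                  'ABC Software', 'Sterling Energy Corp'],
    'Revenues': [12000000, 850000, 3500000, 1000000],
    'Growth%': [0.25, 0.10, 0.30, 0.05],
})

print(CustomerSampleData.shape)           # (4, 3)
print(list(CustomerSampleData.columns))   # ['Customers', 'Revenues', 'Growth%']
```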
First, I’m creating a function called IndustryConditions.
A function is a block of code that only runs when it’s called.
Inside the parentheses is the name of the function’s argument. Information is passed into functions as arguments.
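As a quick illustration of those two ideas, here is a tiny function that is separate from the customer data. The name describe_customer and its argument name are hypothetical, chosen only for this sketch.

```python
# A hypothetical function, not part of the article's data set.
def describe_customer(name):
    # 'name' is the argument: the value passed in when the function is called.
    return 'Customer: ' + name

# Nothing inside the function runs until it is called with an argument.
label = describe_customer('ABC Software')
print(label)  # Customer: ABC Software
```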
In this function, the name of an industry is returned for each customer.
I’m also creating a new column called “Industries” and filling it by applying the function to each row of the data (that is what axis=1 does).
In the table, there’s a new column with each of these industries.
def IndustryConditions(a):
    if a['Customers'] == 'Duke Trading Associates':
        return 'Financial Services'
    elif a['Customers'] == 'Luxury Restaurant Group':
        return 'Food & Beverage'
    elif a['Customers'] == 'ABC Software':
        return None  # the industry label for this customer is missing from the source
    elif a['Customers'] == 'Sterling Energy Corp':
        return 'Travel & Hospitality'

CustomerSampleData['Industries'] = CustomerSampleData.apply(IndustryConditions, axis=1)
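For a one-to-one lookup like this, a dictionary passed to Series.map is a common alternative to a chain of if/elif statements. This is only a sketch, not the code used in this article: it reuses the first two customer and industry pairs from the function above, and the DataFrame and the 'Unlisted Customer' name are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({'Customers': ['Duke Trading Associates',
                                 'Luxury Restaurant Group',
                                 'Unlisted Customer']})  # hypothetical name

# Lookup table: customer name -> industry
industry_by_customer = {
    'Duke Trading Associates': 'Financial Services',
    'Luxury Restaurant Group': 'Food & Beverage',
}

# Customers that are not in the dictionary end up as NaN.
df['Industries'] = df['Customers'].map(industry_by_customer)
print(df)
```

Adding a customer then means adding one dictionary entry rather than another elif branch.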
In this section, I’m creating a function called RevenueConditions.
In this function, I’m categorizing revenues from highest to lowest. The highest threshold is checked first, because each row stops at the first condition it meets.
I’m applying this function to a new column called “RevenueCategories.”
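def RevenueConditions(b):
    if b['Revenues'] >= 10000000:
        return None  # category label is missing from the source
    elif b['Revenues'] >= 1000000:
        return None  # category label is missing from the source
CustomerSampleData['RevenueCategories'] = CustomerSampleData.apply(RevenueConditions, axis=1)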
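To see the ordering at work, here is a self-contained sketch of the same pattern. The two thresholds come from this article, but the labels 'High', 'Medium' and 'Low', the function name revenue_category, and the revenue figures are assumptions made for illustration.

```python
import pandas as pd

# Made-up revenues, including one exactly on the upper threshold.
df = pd.DataFrame({'Revenues': [15000000, 10000000, 2500000, 400000]})

def revenue_category(row):
    # Check the highest threshold first; a row stops at the first match.
    if row['Revenues'] >= 10000000:
        return 'High'      # hypothetical label
    elif row['Revenues'] >= 1000000:
        return 'Medium'    # hypothetical label
    else:
        return 'Low'       # hypothetical label

df['RevenueCategories'] = df.apply(revenue_category, axis=1)
print(df['RevenueCategories'].tolist())  # ['High', 'High', 'Medium', 'Low']
```

If the two checks were swapped, every revenue of 1 million or more would match the lower threshold first and nothing would ever reach the top category.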
Numerical Conditions 2
In this function, I’m flagging an opportunity as high when revenues are at least 1 million and the growth percentage is at least 20 percent. Both conditions have to be true.
I’m applying this function to a new column called “Opportunity.”
def OpportunityConditions(c):
    if (c['Revenues'] >= 1000000) and (c['Growth%'] >= 0.2):
        return None  # return value is missing from the source
CustomerSampleData['Opportunity'] = CustomerSampleData.apply(OpportunityConditions, axis=1)
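A two-condition rule like this can also be written without a row-by-row function, by combining boolean columns and passing the result to numpy.where. This is an alternative sketch rather than the approach used in this article; the labels 'High' and 'Low' and the sample numbers are assumptions.

```python
import numpy as np
import pandas as pd

# Made-up rows: only the first one meets both conditions.
df = pd.DataFrame({
    'Revenues': [2000000, 2000000, 500000],
    'Growth%': [0.25, 0.10, 0.40],
})

# Both conditions must hold; & combines the two boolean columns element-wise.
is_high = (df['Revenues'] >= 1000000) & (df['Growth%'] >= 0.2)
df['Opportunity'] = np.where(is_high, 'High', 'Low')  # labels are hypothetical
print(df['Opportunity'].tolist())  # ['High', 'Low', 'Low']
```

Note that the element-wise version uses & with parentheses around each comparison, whereas the row-wise function uses the plain and keyword.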