In this article, we’re going to explore how to connect a dataset from a Python script to a SQL Server database. Then, I’ll explain how to export a dataset from Python to SQL Server. After that, I’ll explain how to use Python to query data from the newly created table in SQL Server.
Use Cases
First, here are three reasons why you may want to use Python to export datasets to SQL Server:
- You might want to transform the data in Python before exporting the data to SQL Server
- You might want to join datasets together in a Python script before exporting that data to SQL Server
- You might want to schedule exports on a daily or weekly basis, probably with a program like Windows Scheduler that would run Python scripts.
Creating an ODBC Data Source
- As a basic requirement, make sure Python and SQL Server are installed on your computer. I’m using Jupyter Notebook for Python and SQL Server Developer edition.
- Next, you’ll need to establish an ODBC Data Source. Search for “ODBC.”
- Choose “ODBC Data Sources (64 bit).”
- Go to the “System DSN” tab.
- Press “Add.”
- Choose “ODBC Driver for SQL Server”
- Next, choose a name for your data source. It can be something as simple as “SQLServerConnection.”
- At the Server option, input the name of your SQL server instance. You can copy the server name from SQL Server and paste it here.
- Next, it’s going to ask you how SQL Server should verify the authentication of the login ID? Choose with SQL Server authentication using a login ID and password.
- Enter a Login ID and password associated with your SQL Server instance.
- At this next screen, you can choose which database that you want to be associated with this ODBC connection. For this demo, I’ll leave it on “master.” Leave everything else default.
- At the next screen, leave all of these settings default as well.
- Click finish
- Confirm that the new ODBC data source has been created.
Exporting a Dataset from Python
- Let’s begin in a Jupyter Notebook.
- Import two python packages:
- import sqlalchemy
- import pyodbc
- As you can see, I imported a dataset from a csv file on my desktop.
- Here is how the dataset looks
- After that, write this statement to establish the data source connection
- Export the dataset to SQL Server using this line of code
# imports pandas, a package for data manipulation and analysis
import pandas as pd
# imports pyodbc, an open source Python module that makes accessing ODBC databases simple
import pyodbc
# imports sqlachemy, a SQL toolkit and Object Relational Mapper that gives application developers the full power and flexibility of SQL
import sqlalchemy
# imports a sample Super Store dataset provided by Tableau
SuperStoreData = pd.read_csv(r’C:\Users\timen\OneDrive\AnalyticsExplained.com\Video Projects\10 Types of Window Functions\Sample – Superstore.csv’)
SuperStoreData.head(5)
# sets up an engine for an SQLAlchemy application, defining how to connect with a database
engine = sqlalchemy.create_engine(“mssql+pyodbc://enallst:Space72446@SQLServerConnection”)
# exports the dataset to a table called “SuperStoreDataFromPython” under the database associated with the ODBC connection
SuperStoreData.to_sql(“SuperStoreDataFromPython”, con=engine, if_exists=’replace’, index_label=’index’)
View the New Table in SQL Server
Next, go to SQL Server, refresh the database, and as you can see, the table I just exported from the Python script is now visible.
Querying Data from SQL Server
# queries data from another table containing the sample Super Store data
query1 = “SELECT * FROM [SuperStoreSampleData].[dbo].[SuperStoreDataFull]”
SuperStoreData = pd.read_sql(query1, engine)
SuperStoreData.head()
# another query from a table containing the sample Super Store data; a more proper SQL code layout is shown
query2 = “””
SELECT *
FROM [SuperStoreSampleData].[dbo].[SuperStoreDataFull]
WHERE [Ship Status] = ‘Shipped Late’ AND Profit > ‘6’
“””
SuperStoreData2 = pd.read_sql(query2, engine)
SuperStoreData2.head()
URLs to recommended courses are shown below:
Programming for Data Science with Python Nanodegree (Udacity):
Become a Data Analyst Nanodegree (Udacity):
Python for Data Science and Machine Learning Bootcamp (Udemy):
Python A-Z™: Python For Data Science With Real Exercises! (Udemy):
Sources
- https://en.wikipedia.org/wiki/Open_Database_Connectivity
- https://docs.microsoft.com/en-us/sql/integration-services/import-export-data/connect-to-an-odbc-data-source-sql-server-import-and-export-wizard?view=sql-server-ver15
- https://www.dundas.com/support/learning/documentation/connect-to-data/how-to/connecting-to-odbc#:~:text=A%20System%20DSN%20is%20a,Data%20Sources%20(ODBC)%20shortcut.
- https://www.sciencedirect.com/topics/computer-science/master-database
- https://docs.sqlalchemy.org/en/13/core/engines.html
- https://docs.sqlalchemy.org/en/13/glossary.html#term-DBAPI