In this post I will show you a simple python program that import a CVS file from a remote PC into a table of Amazon RDS PostgreSQL using a a SSH tunnel created by the same Python program.
I going to use an Amazon RDS PostgreSQL without a Public IP, for this reason I will use an Amazon EC2 instance as bastion host (Jump server).
Both Amazon RDS PostgreSQL and EC2 Instance use the same VPC (172.31.0.0/16) and the EC2 Instance can connect to Amazon RDS PostgreSQL using the Port 5432 and the python program will connect remotely to EC2 instance using a SSH tunnel.
Current Environment
Instance ID/ Db Instance ID
Local IP
Public IP
Secure Group
Enabled Port
EC2 Instance
i-032f4a47077b4a2ad
172.31.82.38
3.85.36.103
SECURITY_TO_EC2_POSTGRESQL
22
Amazon Postg2ress RDS
postmig
172.31.17.227
Not enabled
ACCESS_TO_RDS_POSTGRESQL
5432
Local Workstation
127.0.0.1
x1.x2.x3.x4
5439
The local PC run a python program (version 3.75) that executes three things:
Create a SSH Tunnel from the laptop with final destination the Amazon RDS PostgreSQL
Create a connection to Amazon RDS PostgreSQL using the SSH tunnel.
Import a CSV file into Amazon RDS PostgreSQL.
1. I going to validate the rules used in Security Groups for the EC2 instance and Amazon RDS using AWS CLI 2.0.10.
a. Validate what is the active Security Group used by the EC2 Instance (i-032f4a47077b4a2ad)
d. Validate the Security Group (Group ID sg-0d30f4aeed419af97) used by Amazon RDS DB only grant access trough the port 5432 to the EC2 Instance (located in the same VPC)
2. Check if the packages psycopg2 and sshtunnel were installed in python, if they were not , install them using pip(Python Package Installer)
C:\Users\user>pip freeze|findstr sshtunnel
C:\Users\user>pip freeze|findstr psycopg2
C:\Users\user>pip install psycopg2
Collecting psycopg2
Using cached https://files.pythonhosted.org/packages/83/8d/bbb2ca983f3939066e8d104fe7a7b0fce1fd3d0f706ddb6d8a86bb33f5d
a/psycopg2-2.8.5-cp37-cp37m-win_amd64.whl
Installing collected packages: psycopg2
Successfully installed psycopg2-2.8.5
WARNING: You are using pip version 19.2.3, however version 20.1 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.
C:\Users\user>pip install sshtunnel
Collecting sshtunnel
Using cached https://files.pythonhosted.org/packages/c5/5c/4b320d7ec4b0d5d4d6df1fdf66a5799625b3623d0ce4efe81719c6f8dfb
3/sshtunnel-0.1.5.tar.gz
Requirement already satisfied: paramiko>=1.15.2 in c:\users\user\appdata\local\programs\python\python37\lib\site-package
s (from sshtunnel) (2.7.1)
Requirement already satisfied: pynacl>=1.0.1 in c:\users\user\appdata\local\programs\python\python37\lib\site-packages (
from paramiko>=1.15.2->sshtunnel) (1.3.0)
Requirement already satisfied: bcrypt>=3.1.3 in c:\users\user\appdata\local\programs\python\python37\lib\site-packages (
from paramiko>=1.15.2->sshtunnel) (3.1.7)
Requirement already satisfied: cryptography>=2.5 in c:\users\user\appdata\local\programs\python\python37\lib\site-packag
es (from paramiko>=1.15.2->sshtunnel) (2.9.2)
Requirement already satisfied: six in c:\users\user\appdata\local\programs\python\python37\lib\site-packages (from pynac
l>=1.0.1->paramiko>=1.15.2->sshtunnel) (1.14.0)
Requirement already satisfied: cffi>=1.4.1 in c:\users\user\appdata\local\programs\python\python37\lib\site-packages (fr
om pynacl>=1.0.1->paramiko>=1.15.2->sshtunnel) (1.14.0)
Requirement already satisfied: pycparser in c:\users\user\appdata\local\programs\python\python37\lib\site-packages (from
cffi>=1.4.1->pynacl>=1.0.1->paramiko>=1.15.2->sshtunnel) (2.20)
Installing collected packages: sshtunnel
Running setup.py install for sshtunnel ... done
Successfully installed sshtunnel-0.1.5
WARNING: You are using pip version 19.2.3, however version 20.1 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.
3. Create a file dept.csv in the folder c:\python that will be imported in Amazon RSD PostgreSQL
DEPT_ID,DEPT_NAME 1,HR 2,IT 3,MARKETING
4. Explanation of the Python program
a. Import the package sshtunnel to create the sshtunnel and the package psycopg2 to connect to Postgress and execute different SQL statements.
from sshtunnel import SSHTunnelForwarder import psycopg2
b. Initialize the parameters to create a ssh tunnel
remote_user = ‘dbadmin’ #==> user of EC2 instance remote_host = ‘3.85.36.103’ #==> Public IP of EC2 instance remote_port = 22 #==> Port used to connect to SSH private_host = ‘postmig.c3qqnqe4hzdh.us-east-1.rds.amazonaws.com’ #==> is the Endpoint of Amazon RDS (Private IP) private_port = 5432 #==> Enabled Port to connect to PostgreSQL local_host = ‘127.0.0.1’ # ==> Local port of the PC run the python program local_port = 5439 #==> The local port of PC
c. Create a SSH tunnel between that allow the connection among Local PC, the EC2 instance and Amazon RDS then start the SSH tunnel.
d. Create a connection to Amazon RDS PostgreSQL using the ssh tunnel and create a cursor connection to execute SQL statements
conn = psycopg2.connect(host=”127.0.0.1″, port = 5439, database=”db_post_mig”, user=”postgres”, password=”oracle”) #Open Connection to PostgreSQL DB cur = conn.cursor() #Open Cursor to execute commands on Database
e. Create a table dept in the schema progreso if it does not exist
cur.execute(“””create table IF NOT EXISTS progreso.dept (DEPT_ID bigint, DEPT_NAME varchar(128));”””) #create a table if does not exist
f. Import the data from the file dep.csv to the table progreso.dept
with open(‘C:\python\dep.csv’) as csv_file: next(csv_file) # Skip the header cur.copy_from(csv_file, ‘progreso.dept’, sep=’,’) conn.commit()
g. Close the cursor, connection and the SSH tunnel.
cur.close() #Close Cursor conn.close() #Close Connection to DB server.stop() #Close Tunnel
5. List of the python programimp_csv_to_RDS_tunnel.py
C:\python>type imp_csv_to_RDS_tunnel.py
from sshtunnel import SSHTunnelForwarder
import psycopg2
remote_user = 'dbadmin'
remote_host = '3.85.36.103'
remote_port = 22
private_host = 'postmig.c3qqnqe4hzdh.us-east-1.rds.amazonaws.com'
private_port = 5432
local_host = '127.0.0.1'
local_port = 5439
server = SSHTunnelForwarder(
(remote_host, remote_port),
ssh_username=remote_user,
ssh_password='********',
remote_bind_address=(private_host, private_port),
local_bind_address=(local_host, local_port),
) #Configure Tunnel
server.start() #Open Tunnel
#Open Connection to PostgreSQL DB
conn = psycopg2.connect(host="127.0.0.1", port = 5439, database="db_post_mig", user="postgres", password="oracle")
cur = conn.cursor() #Open Cursor to execute commands on Database
#create a table if does not exist
cur.execute("""create table IF NOT EXISTS progreso.dept (DEPT_ID bigint, DEPT_NAME varchar(128));""")
#Dpen and upload the file.csv in the table progreso.dept
with open('C:\python\dep.csv') as csv_file:
next(csv_file) # Skip the header
cur.copy_from(csv_file, 'progreso.dept', sep=',')
conn.commit()
cur.close() #Close Cursor
conn.close() #Close Connection to DB
server.stop() #Close Tunnel
6. Verify the program phtyon program imp_csv_to_RDS_tunnel.py
a. Check how many row has the table progreso.dept before run the python program (connected from the EC2 instance)
[dbadmin@dbserver ~]$ psql -h 172.31.17.227 -U postgres -p 5432
Password for user postgres:
psql (12.2)
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: off)
Type "help" for help.
postgres=> \c db_post_mig
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: off)
You are now connected to database "db_post_mig" as user "postgres".
db_post_mig=> select * from progreso.dept;
dept_id | dept_name
---------+-----------
(0 rows)
b. Run the the program phtyon program imp_csv_to_RDS_tunnel.py
C:\python>python imp_csv_to_RDS_tunnel.py
c. Check if were inserted the 3 rows in the table.
db_post_mig=> select * from progreso.dept;
dept_id | dept_name
---------+-----------
1 | HR
2 | IT
3 | MARKETING
(3 rows)
May 9 2020
Importing data from a CVS file to a Amazon RDS PostgreSQL 12.2 using a SSH tunnel created in Python
In this post I will show you a simple python program that import a CVS file from a remote PC into a table of Amazon RDS PostgreSQL using a a SSH tunnel created by the same Python program.
I going to use an Amazon RDS PostgreSQL without a Public IP, for this reason I will use an Amazon EC2 instance as bastion host (Jump server).
Both Amazon RDS PostgreSQL and EC2 Instance use the same VPC (172.31.0.0/16) and the EC2 Instance can connect to Amazon RDS PostgreSQL using the Port 5432 and the python program will connect remotely to EC2 instance using a SSH tunnel.
Current Environment
The local PC run a python program (version 3.75) that executes three things:
1. I going to validate the rules used in Security Groups for the EC2 instance and Amazon RDS using AWS CLI 2.0.10.
a. Validate what is the active Security Group used by the EC2 Instance (i-032f4a47077b4a2ad)
b. Validate the Security Group “SECURITY_TO_EC2_POSTGRESQL” grant access only to the PC (Public IP: x1.x2.x3.x4) using the port 22 (SSH)
c. Validate what is the active Security Group used by the Amazon RDS DB (i-032f4a47077b4a2ad)
d. Validate the Security Group (Group ID sg-0d30f4aeed419af97) used by Amazon RDS DB only grant access trough the port 5432 to the EC2 Instance (located in the same VPC)
2. Check if the packages psycopg2 and sshtunnel were installed in python, if they were not , install them using pip (Python Package Installer)
3. Create a file dept.csv in the folder c:\python that will be imported in Amazon RSD PostgreSQL
DEPT_ID,DEPT_NAME
1,HR
2,IT
3,MARKETING
4. Explanation of the Python program
a. Import the package sshtunnel to create the sshtunnel and the package psycopg2 to connect to Postgress and execute different SQL statements.
import psycopg2
b. Initialize the parameters to create a ssh tunnel
remote_host = ‘3.85.36.103’ #==> Public IP of EC2 instance
remote_port = 22 #==> Port used to connect to SSH
private_host = ‘postmig.c3qqnqe4hzdh.us-east-1.rds.amazonaws.com’ #==> is the Endpoint of Amazon RDS (Private IP)
private_port = 5432 #==> Enabled Port to connect to PostgreSQL
local_host = ‘127.0.0.1’ # ==> Local port of the PC run the python program
local_port = 5439 #==> The local port of PC
c. Create a SSH tunnel between that allow the connection among Local PC, the EC2 instance and Amazon RDS then start the SSH tunnel.
(remote_host, remote_port),
ssh_username=remote_user,
ssh_password=’********’,
remote_bind_address=(private_host, private_port),
local_bind_address=(local_host, local_port),
) #Configure Tunnel
server.start() #Open Tunnel
d. Create a connection to Amazon RDS PostgreSQL using the ssh tunnel and create a cursor connection to execute SQL statements
cur = conn.cursor() #Open Cursor to execute commands on Database
e. Create a table dept in the schema progreso if it does not exist
f. Import the data from the file dep.csv to the table progreso.dept
next(csv_file) # Skip the header
cur.copy_from(csv_file, ‘progreso.dept’, sep=’,’)
conn.commit()
g. Close the cursor, connection and the SSH tunnel.
conn.close() #Close Connection to DB
server.stop() #Close Tunnel
5. List of the python program imp_csv_to_RDS_tunnel.py
6. Verify the program phtyon program imp_csv_to_RDS_tunnel.py
a. Check how many row has the table progreso.dept before run the python program (connected from the EC2 instance)
b. Run the the program phtyon program imp_csv_to_RDS_tunnel.py
c. Check if were inserted the 3 rows in the table.
By DBA_JAF • Amazon RDS, Amazon Web Services (AWS), Cloud, PostgreSQL, Programming Languages, Python • 0