During my recent engagement, I faced an interesting challenge of accessing hundreds (956 to be exact) of RDS databases at scale. The task was further complicated by the fact that I could only use Active Directory credentials to access these databases since they were attached to a network domain. Additionally, I had a hard time finding any concise guides on using Python to access MSSQL RDS databases online with JDBC. Given the complexity of the task and the lack of existing resources, I decided to document my solution in this blog post in hopes of aiding others who may find themselves in a similar predicament.
First, let’s set up the challenge at hand: we have hundreds of RDS databases with varying locations and names, and we need to access all of them using our corporate Active Directory credentials. To keep it platform-independent, we’ll use JDBC, a database access protocol that allows us to connect to a database in a platform-independent manner.
The Python library jaydebeapi is an interface to Java JDBC that allows Python programs to use JDBC to connect to databases. Using this, we can connect to each of our databases in a scalable manner. Here is the script that helped me get through this situation:
The snippet above is primarily used for the setup. In the .env file, the user should store their active directory credentials and the file path of the JDBC jar which can be found here.
Now, let’s see how we can connect to these databases and execute a query:
Since my laptop wasn't connected to the network domain, I couldn't use windows authentication to connect to the RDS databases. Instead I ended up using NTLM authentication (line 4) which worked perfectly.
Error handling is also in place for any issues that may occur while executing the query or connecting to the server. Since we are using a JAR file for connecting to the databases, the JAR can throw its own errors which need to be caught separately.
The rest of the code initiates a connection to each database server and sends a simple SELECT @@VERSION; query to test the connection. Any SQL results are fetched and printed out.
In conclusion, the lack of online resources does not deter a determined programmer. With a combination of Python, JDBC, and a well-structured approach, I was able to effectively access hundreds of RDS databases at scale using Active Directory credentials. I hope this solution serves as a useful guide for anyone facing a similar situation or simply trying to understand the interaction between Python, JDBC, and MS SQL RDS databases. Remember, every problem is just an opportunity for a creative solution!