What is SQL injection and why it matters for tools that generate SQL

SQL injection is one of the oldest and best-documented attack classes in web security. Injection has appeared in every edition of the OWASP Top 10 since the list was first published in 2003, it was responsible for some of the largest data breaches of the 2000s and 2010s, and it still appears regularly in security disclosures today.

If you’re using a tool that generates SQL on your behalf, it helps to understand SQL injection: not to alarm you, but so you know what to look for.

What SQL injection is

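SQL injection happens when user-supplied input is inserted directly into the text of a SQL query, and the input contains SQL syntax that changes what the query does. The database receives a single string and cannot tell which parts came from the developer and which came from the user.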

The classic example is a login form that builds a query like this, substituting whatever the user typed for each placeholder:

SELECT * FROM users WHERE username = '<username>' AND password = '<password>';

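If a user enters ' OR '1'='1' -- as the username, the query becomes:

SELECT * FROM users WHERE username = '' OR '1'='1' -- ' AND password = '...';

The -- starts a comment, so the password check is discarded. Because '1'='1' is always true, the query returns every row in users, and an application that only checks whether a row came back logs the attacker in without a valid password. The trailing comment matters: without it, AND binds more tightly than OR, so the condition would read username = '' OR ('1'='1' AND password = '...') and the password would still have to match.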
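The bypass can be reproduced in a few lines. The following is a minimal sketch using Python's built-in sqlite3 module; the users table, its rows, and the vulnerable_login function are made up for illustration and are not taken from any real application. The payload ends with a -- comment so that the password check is discarded.

```python
import sqlite3

# In-memory database with a hypothetical users table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "s3cret"), ("bob", "hunter2")],
)


def vulnerable_login(username, password):
    # Deliberately vulnerable: user input is pasted into the query text.
    query = (
        "SELECT username FROM users "
        f"WHERE username = '{username}' AND password = '{password}'"
    )
    return conn.execute(query).fetchall()


# A wrong password returns no rows.
normal = vulnerable_login("alice", "wrong")

# The payload closes the string, adds an always-true condition,
# and comments out the password check.
bypass = vulnerable_login("' OR '1'='1' --", "anything")
```

Here normal is empty, while bypass contains a row for every user in the table.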

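More destructive variations, each of which depends on the database, the driver, and the shape of the original query:

-- Drop a table (works only where the driver allows several statements in one call)
'; DROP TABLE users; --

-- Exfiltrate data from another table (the column count must match the original SELECT)
' UNION SELECT username, password FROM admin_users --

-- Bypass a condition entirely
' OR 1=1 --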

How parameterized queries prevent it

The standard defence against SQL injection is parameterized queries (also called prepared statements). Instead of concatenating input into the query string, the query is sent to the database with placeholders, and the input values are sent separately.

# Vulnerable: string concatenation
query = f"SELECT * FROM users WHERE username = '{username}'"

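# Safe: parameterized query. The %s is the driver's placeholder
# (as in psycopg2), not Python string formatting; the value is passed separately.
cursor.execute("SELECT * FROM users WHERE username = %s", (username,))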

With parameterized queries, the database engine treats the input as a literal value, never as SQL syntax. No matter what characters the input contains, it cannot change the structure of the query.
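To see the difference in practice, here is a small sketch that passes an injection payload as a bound parameter. It uses Python's built-in sqlite3 module, whose placeholder is ? rather than %s (the placeholder style varies by driver); the users table and the find_user function are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?)", ("alice", "s3cret"))


def find_user(username):
    # The query text is fixed; the value travels separately as a parameter.
    return conn.execute(
        "SELECT username FROM users WHERE username = ?",
        (username,),
    ).fetchall()


legit = find_user("alice")
# The payload is compared as a plain string, so it matches no username.
attack = find_user("' OR '1'='1' --")
```

The payload is treated as an odd-looking username and nothing else: legit holds the row for alice and attack is empty.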

This is not optional. Applications that construct SQL by concatenating user input are vulnerable by default.

What changes when a tool generates SQL for you

When you use a tool that translates natural language into SQL, you are not directly writing queries — the tool is generating them. This changes the risk profile:

The input is your prompt text, not raw SQL. The tool receives your natural language, generates SQL, and executes it. Your prompt is not incorporated into the SQL as a string (at least in a well-built tool) — it’s interpreted by a language model to produce a query.

The generated SQL should be parameterized, not string-interpolated, when it contains any values derived from your input. For example, if you type “show orders from customer John Smith”, the generated query should use a parameter for “John Smith”:

SELECT * FROM orders WHERE customer_name = $1
-- with parameter: 'John Smith'

Not:

SELECT * FROM orders WHERE customer_name = 'John Smith'
-- where 'John Smith' came from user input and was concatenated in
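One way a tool could keep generated SQL and prompt-derived values apart is to have the generation step return a query template plus a separate list of parameters. The sketch below is hypothetical: GeneratedQuery and run are names invented for this example, not the API of any particular product, and it uses sqlite3's ? placeholder where PostgreSQL would use $1.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneratedQuery:
    """Hypothetical output of the generation step: SQL text plus bound values."""
    sql: str
    params: tuple


def run(conn, query):
    # Values are bound by the driver, never spliced into the SQL text.
    return conn.execute(query.sql, query.params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer_name TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "John Smith"), (2, "Mary O'Brien")],
)

template = "SELECT id FROM orders WHERE customer_name = ?"
rows = run(conn, GeneratedQuery(template, ("John Smith",)))

# A name containing a quote needs no escaping when it is bound as a parameter.
quoted = run(conn, GeneratedQuery(template, ("Mary O'Brien",)))
```

Because the value never becomes part of the SQL text, a name such as Mary O'Brien works without any escaping, which is the same property that defeats injection payloads.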

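Read-only enforcement is a meaningful second layer. Even if a generated query were somehow manipulated to contain malicious SQL, a read-only database connection means INSERT, UPDATE, DELETE, and DROP statements would fail at the database level. It does not stop a manipulated SELECT from reading data the connection can already see, so it helps to limit that connection to the tables the tool actually needs.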
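As a rough illustration of database-level enforcement, the sketch below uses SQLite's query_only pragma as a stand-in for a read-only connection. This is an assumption made to keep the example self-contained: on a server database the equivalent would usually be a role that has only SELECT privileges. The orders table and the try_statement helper are hypothetical.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode,
# so each statement below stands on its own.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE orders (id INTEGER, customer_name TEXT)")
conn.execute("INSERT INTO orders VALUES (1, 'John Smith')")

# From here on, SQLite refuses statements that would modify the database.
conn.execute("PRAGMA query_only = ON")


def try_statement(sql):
    try:
        conn.execute(sql).fetchall()
        return "ok"
    except sqlite3.Error:
        return "rejected"


results = {
    sql: try_statement(sql)
    for sql in (
        "SELECT id FROM orders",
        "INSERT INTO orders VALUES (2, 'Eve')",
        "DELETE FROM orders",
        "DROP TABLE orders",
    )
}
remaining = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

Only the SELECT succeeds; the three write statements are rejected by the database itself, and the single original row is still present afterwards.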

What to evaluate in a SQL-generating tool

When assessing any tool that generates and runs SQL on your behalf, these are the relevant questions:

Does the tool use parameterized queries?

If the tool generates SQL and executes it against your database, any values taken from your input (in WHERE clauses and anywhere else they appear) should be passed as parameters, not interpolated into the query string. This is an implementation detail that may not be easy to verify from the outside, but you can ask the vendor directly.

Does it enforce read-only access?

A tool that blocks all write operations (INSERT, UPDATE, DELETE, DROP, ALTER, TRUNCATE) provides meaningful protection. Even if a query were manipulated to attempt a write, it would be rejected.

This is separate from the permissions of the database user you connect as. Application-level enforcement adds a second check before the query ever reaches the database.
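An application-level check of this kind can be sketched as a keyword filter that runs before any query is sent. This is a deliberately naive illustration, not how any specific tool works: the keyword list mirrors the one above, and contains_write_keyword is a hypothetical helper.

```python
import re

# Statement types the tool refuses to send to the database.
WRITE_KEYWORDS = ("INSERT", "UPDATE", "DELETE", "DROP", "ALTER", "TRUNCATE")

# \b keeps column names such as updated_at from matching UPDATE.
_WRITE_PATTERN = re.compile(
    r"\b(" + "|".join(WRITE_KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def contains_write_keyword(sql):
    """Return True if the SQL text mentions any blocked keyword."""
    return _WRITE_PATTERN.search(sql) is not None


safe = contains_write_keyword("SELECT id, updated_at FROM orders")
stacked = contains_write_keyword("SELECT 1; drop table users")
# Known limitation: a keyword inside a string literal is also flagged.
false_positive = contains_write_keyword(
    "SELECT id FROM notes WHERE body = 'please delete me'"
)
```

A text-level filter like this errs on the side of rejecting too much (the last query is harmless but is still flagged), which is one reason it works best alongside database-level permissions rather than instead of them.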

Does it validate or reject dangerous query patterns?

Beyond blocking write operations, a responsible tool should validate generated queries before executing them: for example, that each one is a single SELECT statement and that it touches only the tables it is meant to query.
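One possible form of such validation is an allow-list of tables, enforced when the statement is compiled. The sketch below uses the set_authorizer hook in Python's sqlite3 module, which is specific to SQLite; other databases would need a SQL parser or per-role grants to get a similar effect. ALLOWED_TABLES, the two tables, and the check helper are assumptions made for the example.

```python
import sqlite3

ALLOWED_TABLES = {"orders"}  # hypothetical allow-list

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE orders (id INTEGER, customer_name TEXT)")
conn.execute("CREATE TABLE admin_users (username TEXT, password TEXT)")


def authorizer(action, arg1, arg2, db_name, source):
    # Permit SELECT statements as such.
    if action == sqlite3.SQLITE_SELECT:
        return sqlite3.SQLITE_OK
    # Permit column reads only on allow-listed tables (arg1 is the table name).
    if action == sqlite3.SQLITE_READ and arg1 in ALLOWED_TABLES:
        return sqlite3.SQLITE_OK
    # Everything else (writes, schema changes, other tables) is denied.
    return sqlite3.SQLITE_DENY


# Installed after setup, since it would also deny the CREATE statements.
conn.set_authorizer(authorizer)


def check(sql):
    try:
        conn.execute(sql).fetchall()
        return "allowed"
    except sqlite3.DatabaseError:
        return "rejected"


plain = check("SELECT id, customer_name FROM orders")
union = check(
    "SELECT customer_name FROM orders "
    "UNION SELECT password FROM admin_users"
)
drop = check("DROP TABLE orders")
```

The plain SELECT on orders is allowed, while the UNION that reaches into admin_users and the DROP are both rejected before they run.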

Is the generated SQL visible to you?

Transparency matters. You should be able to see the exact SQL that will run before it runs, or at minimum in the query history afterwards. This lets you catch unexpected patterns.
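A query history can be as simple as recording the exact SQL text and parameters before each execution. The following sketch assumes a hypothetical run_visible wrapper and an in-memory list as the history store; a real tool would more likely persist this and show it in its interface.

```python
import sqlite3

history = []  # each entry is (sql_text, params), in execution order


def run_visible(conn, sql, params=()):
    # Record the statement before it runs, so even failed queries are visible.
    history.append((sql, params))
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer_name TEXT)")
conn.execute("INSERT INTO orders VALUES (1, 'John Smith')")

rows = run_visible(
    conn,
    "SELECT id FROM orders WHERE customer_name = ?",
    ("John Smith",),
)
```

After this call, history holds the exact statement and parameter that were sent, which is the information a user needs in order to spot an unexpected query.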

The honest picture

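SQL injection against a natural language query tool is a different attack surface from SQL injection against a traditional web application. The typical attack vector, a user typing malicious characters into a form field that is concatenated into a query, does not apply in the same way, because the prompt is interpreted by a language model rather than pasted into the SQL.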

The relevant risk is whether the tool’s SQL generation is robust against edge cases and whether it protects against unintended queries. Read-only enforcement and parameterized execution are the right controls for this context.

These aren’t exotic safeguards; they’re standard engineering practice. Ask about them when evaluating any tool that touches your database.