Preamble
Recently, I helped a customer resolve a “mysterious” performance problem: a poorly chosen query parameter data type led to bad query performance. Working on this case made me realize that not everyone understands what the data type of a query parameter means, so I decided to write this article.
What are query parameters?
In SQL, it’s common to see queries that differ only in constant values:
SELECT val FROM large WHERE id = 42;
SELECT val FROM large WHERE id = 4711;
SELECT val FROM large WHERE id = 1001;
It makes sense to consider them as different instances of the same query:
SELECT val FROM large WHERE id = $1;
Here, $1 is a placeholder for an arbitrary value. Such a placeholder is known as a query parameter, also called a bind variable. When you execute the query, you have to supply an actual parameter value as an argument for the parameter.
A placeholder offers the following advantages:
- The database can parse and optimize the statement once and then reuse the execution plan with different parameter values, which speeds up processing (read this article to learn how PostgreSQL does that).
- Separating the query from its parameters prevents SQL injection, a serious threat to database applications (see the sketch after this list).
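To make the second point concrete, here is a minimal JDBC sketch (hypothetical; it assumes an open java.sql.Connection named conn and a user-supplied string userInput). Concatenating input into the SQL string lets an attacker change the statement, while a query parameter is always treated as a plain value:
// DANGEROUS: if userInput is "1; DROP TABLE large", the attacker wins.
java.sql.Statement unsafe = conn.createStatement();
unsafe.execute("SELECT val FROM large WHERE id = " + userInput);

// SAFE: the parameter is sent separately and can never be parsed as SQL.
java.sql.PreparedStatement safe =
    conn.prepareStatement("SELECT val FROM large WHERE id = ?");
safe.setLong(1, Long.parseLong(userInput));
java.sql.ResultSet rs = safe.executeQuery();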
Support for query parameters in PostgreSQL
There are two methods for sending an SQL statement using the PostgreSQL frontend/backend protocol:
- The whole SQL statement is sent as a single string using the simple query protocol.
- The SQL statement and its parameters are sent separately with the extended query protocol:
  - the “Prepare” message sends the statement with the placeholders $1, $2, etc.
  - the “Bind” message sends the parameter values
  - the “Execute” message executes the query
Keep in mind that query parameters can only stand for constants. You cannot use parameters for identifiers such as table, column, or schema names. Also, parameters are only supported in SELECT, INSERT, UPDATE and DELETE statements. The sketch below illustrates the first restriction.
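As a hedged illustration (again assuming an open connection conn), the following JDBC snippet tries to pass a table name as a parameter. The driver turns the ? into $1 and sends it via the extended protocol, so the server sees a placeholder where an identifier is required and reports a syntax error:
// Identifiers cannot be parameters: the server receives
// "SELECT val FROM $1 WHERE id = 42" and rejects it.
try (java.sql.PreparedStatement stmt =
         conn.prepareStatement("SELECT val FROM ? WHERE id = 42")) {
    stmt.setString(1, "large");
    stmt.executeQuery();   // fails with a syntax error at or near "$1"
} catch (java.sql.SQLException e) {
    System.err.println(e.getMessage());
}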
Prepared statements
You can prepare a statement under a name, so that it can later be executed with different parameter values. Virtually every database client API supports prepared statements.
With Java and JDBC, this works as follows:
java.sql.PreparedStatement stmt =
    conn.prepareStatement("SELECT * FROM large WHERE id = ?");
stmt.setLong(1, 42L);
java.sql.ResultSet rs = stmt.executeQuery();
The placeholder is always ?, and the first argument of the set* methods specifies the parameter number.
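The plan-reuse benefit from above materializes when the same prepared statement is executed repeatedly. A brief sketch reusing stmt from the snippet above (note that the PostgreSQL JDBC driver only switches to a server-side prepared statement after several executions, controlled by its prepareThreshold connection parameter):
// Reuse the statement with different parameter values.
for (long id : new long[] { 42L, 4711L, 1001L }) {
    stmt.setLong(1, id);
    try (java.sql.ResultSet result = stmt.executeQuery()) {
        while (result.next()) {
            System.out.println(result.getDouble("val"));
        }
    }
}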
In PostgreSQL, you can also use prepared statements in SQL. PREPARE creates a prepared statement, and EXECUTE executes it:
PREPARE stmt(bigint) AS SELECT val FROM large WHERE id = $1;
EXECUTE stmt(42);
Query parameter data types and type conversion
You may have noticed that in both of the examples above, the parameter had a data type: in Java it was long, and in SQL it was bigint. You don’t have to specify a data type, though; PostgreSQL will then try to infer the data type on its own. PostgreSQL uses the data type “unknown” to represent such untyped values.
For data types other than unknown, PostgreSQL will also take type conversions into account. For instance, PostgreSQL lacks an equality operator to compare bigint and numeric values:
SELECT o.oprname
FROM pg_operator AS o
   JOIN pg_amop AS ao ON o.oid = ao.amopopr
   JOIN pg_am AS a ON a.oid = ao.amopmethod
WHERE /* an equality operator as used in B-tree indexes */
      ao.amopstrategy = 3
  AND a.amname = 'btree'
  AND (o.oprleft, o.oprright) = ('bigint'::regtype, 'numeric'::regtype);

 oprname
═════════
(0 rows)
But there is an implicit cast from bigint to numeric, so PostgreSQL will automatically convert the bigint to numeric, and the two values can be compared as numeric. The documentation contains the precise rules that govern these conversions. Because PostgreSQL supports function and operator overloading, these rules are quite complicated.
An example of the performance impact of query parameter types
To demonstrate that, let’s look at this straightforward case:
CREATE TABLE large (
   id  bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
   val double precision NOT NULL
);

INSERT INTO large (val)
SELECT random()
FROM generate_series(1, 100000);

VACUUM (ANALYZE) large;
Then we write a Java program that queries the table with the JDBC snippet from above, but we set the parameter in three different ways. In each case, we capture the server’s execution plan with the auto_explain extension. A possible session-level setup is sketched below.
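One hedged way to set up auto_explain for the test session from JDBC (loading the module per session requires appropriate privileges; alternatively, configure auto_explain via shared_preload_libraries in postgresql.conf):
// Log the plan of every statement in this session, including actual row counts.
try (java.sql.Statement setup = conn.createStatement()) {
    setup.execute("LOAD 'auto_explain'");
    setup.execute("SET auto_explain.log_min_duration = 0");
    setup.execute("SET auto_explain.log_analyze = on");
}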
Using integer as the parameter type
The parameter is set with
stmt.setInt(1, 42);
and the resulting execution plan is
Query Text: SELECT * FROM large WHERE id = $1
Index Scan using large_pkey on large (actual time=0.013..0.013 rows=1 loops=1)
  Index Cond: (id = 42)
  Buffers: shared hit=3
This works fine, since setInt marks the parameter as type integer, and there is an equality operator to compare integer and bigint.
Using unknown as the parameter type
The parameter is set as type java.sql.Types.OTHER with
stmt.setObject(1, 42, java.sql.Types.OTHER);
and the resulting execution plan is
Query Text: SELECT * FROM large WHERE id = $1
Index Scan using large_pkey on large (...) (actual time=0.005..0.006 rows=1 loops=1)
  Index Cond: (id = 42)
  Buffers: shared hit=3
This works just as well, because a parameter of type unknown is inferred to have the same type as id, which is bigint.
Using numeric as the parameter type
The parameter is set with
stmt.setBigDecimal(1, java.math.BigDecimal.valueOf(42));
and the resulting execution plan is
Query Text: SELECT * FROM large WHERE id = $1
Seq Scan on large (...) (actual time=0.012..11.123 rows=1 loops=1)
  Filter: ((id)::numeric = '42'::numeric)
  Rows Removed by Filter: 99999
  Buffers: shared hit=935
Unexpectedly, PostgreSQL uses a sequential scan, and performance is substantially worse. The reason is that the JDBC driver maps java.math.BigDecimal to numeric, so the parameter has the data type numeric. As we saw above, there is no equality operator for bigint and numeric, so PostgreSQL converts the bigint column to numeric. The execution plan above shows this clearly: the primary key index is defined on id, not on id::numeric, so it cannot be used here.
How to avoid problems with query parameter data types
We can draw the following conclusions from the above:
- Ideally, choose the correct data type for query parameters. In the example above, setLong would have been the right choice, since the JDBC driver maps long to bigint.
- If you are not sure, send an untyped parameter (type unknown) and let PostgreSQL infer the correct data type (both options are sketched below).
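As a minimal JDBC sketch of both options, reusing the prepared statement from above:
// Option 1: pick the matching type explicitly; long maps to bigint.
stmt.setLong(1, 42L);

// Option 2: send the parameter as untyped ("unknown") and let
// PostgreSQL infer the type from the context.
stmt.setObject(1, 42, java.sql.Types.OTHER);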
Sometimes neither of these approaches works, because there is no data type in your host language that corresponds to the PostgreSQL data type, and PostgreSQL’s type inference guesses wrong. Here is a good example:
SELECT id, name FROM person WHERE birthday <= current_timestamp - $1;
We want to supply an interval like “1 year” or “1-0”. Java has no data type for intervals, so we would have to send an unknown parameter. However, PostgreSQL then infers that we mean the operator timestamp with time zone - timestamp with time zone rather than timestamp with time zone - interval. As a result, we get error messages complaining that “1 year” is not a valid timestamp. The solution in this situation is an explicit type cast in the statement:
SELECT id, name FROM person WHERE birthday <= current_timestamp - CAST ($1 AS interval);
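In JDBC, this could look as follows (a hedged sketch, assuming a person table as in the query above); with the cast in place, the untyped parameter is resolved to interval:
java.sql.PreparedStatement stmt = conn.prepareStatement(
    "SELECT id, name FROM person"
    + " WHERE birthday <= current_timestamp - CAST(? AS interval)");
// sent as an untyped value, cast to interval by the server
stmt.setObject(1, "1 year", java.sql.Types.OTHER);
java.sql.ResultSet rs = stmt.executeQuery();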
As a result, we must add a third item to the list above:
- Add an explicit type cast to the statement if automatic type resolution fails.
Conclusion
It is easy to forget that the data type of a query parameter can affect query performance. Avoid problems by either explicitly specifying the correct type or by leaving the parameter untyped and letting PostgreSQL infer the type, and add an explicit type cast where inference fails.
About Enteros
Enteros offers a patented database performance management SaaS platform. It proactively identifies root causes of complex business-impacting database scalability and performance issues across a growing number of clouds, RDBMS, NoSQL, and machine learning database platforms.
The views expressed on this blog are those of the author and do not necessarily reflect the opinions of Enteros Inc. This blog may contain links to the content of third-party sites. By providing such links, Enteros Inc. does not adopt, guarantee, approve, or endorse the information, views, or products available on such sites.