We are pleased to announce a brand-new major pg_timetable v4 release!
This time, a ton of work was put into it, and the pg_timetable v4 release is incompatible with earlier iterations. Job migration is simple, though, so nobody should be discouraged from updating.
We added extensive documentation and will continue to improve it. Please let us know if there are any areas that should be highlighted or new topics added.
Check out some of the fantastic new features right away! You can see a full list of changes on the official release page, which also has links to issues, commits, and pull requests.
I would like to remind you that pg_timetable is a collaborative effort. Therefore, don’t be afraid to ask any questions, report any bugs, bookmark the pg_timetable project, and let everyone know about it.
pg_timetable v3 versus v4
The earlier version is still regarded as stable. There are no known bugs or serious problems with how it works, so it is safe to keep using it.
However, bug fixes will not be backpatched to v3, and the v3 branch will not receive any new features.
Performance
We made several efforts to improve performance.
First, the database schema was overhauled and made as simple as possible. Because the new pg_timetable has to handle only a small number of system queries, more bandwidth and connections are left for the real workers.
Second, we rewrote the cron-related functions in pure SQL instead of PL/pgSQL. Besides improving performance, this spares us a lot of tedious debugging.
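To illustrate the idea with a generic sketch (this is not pg_timetable's actual internal code): a trivial check written as a pure SQL function can be inlined by the planner, whereas the PL/pgSQL version always pays the cost of the procedural interpreter.

-- Generic illustration only, not pg_timetable's internal code.
-- The PL/pgSQL version is executed by the procedural interpreter on every call:
CREATE FUNCTION cron_field_matches_plpgsql(val int, low int, high int)
RETURNS boolean LANGUAGE plpgsql IMMUTABLE AS $$
BEGIN
    RETURN val BETWEEN low AND high;
END;
$$;

-- The pure SQL version can be inlined into the calling query by the planner:
CREATE FUNCTION cron_field_matches_sql(val int, low int, high int)
RETURNS boolean LANGUAGE sql IMMUTABLE AS $$
    SELECT val BETWEEN low AND high;
$$;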
The extensive logging system makes up the third pillar. Details can be found in the separate section below.
Configuration File Support
Handling configuration only through command-line parameters and environment variables is error-prone and time-consuming.
Common sense suggests using configuration files in such a case. We provide a comprehensive, self-explanatory config.example.yaml file.
Naturally, you can use only a subset of the options, or skip the configuration file entirely. To improve security, it is a good idea to put general options in the configuration file and connection options in environment variables.
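A minimal sketch of that split could look like the following; the --config option and the environment variable name here are illustrative, so check config.example.yaml and the docs for the exact names supported by your version.

# Illustrative sketch only; exact option and variable names are in the docs.
# General options live in the YAML file, the secret stays in the environment.
export PGTT_PASSWORD='********'        # hypothetical variable name
pg_timetable --config=config.yaml      # option name per the v4 documentation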
Built-in CopyFromFile task
One of the most common things database schedulers do is bring data from external sources, such as sensors, logs, official statistics, etc., into the database. For this kind of data, CSV is the de facto standard format. The COPY command in PostgreSQL can import CSV files. However, to use it, the file must be stored in the server’s file system so that the PostgreSQL process can access it. Especially in the cloud era, that is not always the case for production environments.
To get around this restriction in the older version of pg_timetable, we recommended using the psql \copy command, which performs a frontend (client) copy. This is an operation that runs an SQL COPY command, but instead of the server reading or writing the specified file, psql reads or writes the file and routes the data between the server and the local file system. This means that the local user, not the server, needs access to and privileges on the file, and no SQL superuser privileges are required.
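For reference, such a client-side copy looked roughly like this (the file name and delimiter match the CopyFromFile example further down; connection options are omitted):

psql -d timetable -c "\copy location FROM 'download/orte_ansi.txt' WITH DELIMITER '|' NULL 'NULL>'"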
With v4, you no longer need any extra software to perform a client copy: you can use the new built-in CopyFromFile task. The complete example can be found in the manual. As you can see below, the task accepts its parameters as a JSON object.
{
    "sql": "COPY location FROM STDIN WITH DELIMITER '|' NULL 'NULL>'",
    "filename": "download/orte_ansi.txt"
}
- Here sql is the proper COPY ... FROM STDIN command. One can use any of the standard clauses controlling the file specification, e.g. file encoding, conditions for insert, escaping, quoting, etc.
- And filename is the correct path (absolute or relative to pg_timetable) of the source CSV file.
This task works for all formats supported by PostgreSQL: text, csv and binary.
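As a rough sketch of how such a task could be registered, assuming the simplified v4 schema with the timetable.chain, timetable.task and timetable.parameter tables described in the docs (column names may differ slightly in your version):

-- Rough sketch only; consult the v4 docs for the exact schema of your version.
WITH new_chain AS (
    INSERT INTO timetable.chain (chain_name, run_at, live)
    VALUES ('import-locations', '5 0 * * *', true)        -- every night at 00:05
    RETURNING chain_id
), new_task AS (
    INSERT INTO timetable.task (chain_id, task_order, kind, command)
    SELECT chain_id, 10, 'BUILTIN', 'CopyFromFile' FROM new_chain
    RETURNING task_id
)
INSERT INTO timetable.parameter (task_id, order_id, value)
SELECT task_id, 1,
       '{"sql": "COPY location FROM STDIN WITH DELIMITER ''|'' NULL ''NULL>''", "filename": "download/orte_ansi.txt"}'::jsonb
FROM new_task;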
Improved logging
One of the key architectural decisions made for pg_timetable was the database-driven concept. In other words, scheduler sessions are entirely managed by the database, and all output is saved there as well for later analysis.
File logs
Despite this, people were using output redirection to store logs in files, such as
pg_timetable -c loader -u scheduler -d timetable > pgtt.log
The disadvantage of this method is that the data is stored as plain text, which is difficult for machines to process. In the new version, we have introduced the ability to duplicate logs to the file system using the --log-file command-line parameter. Moreover, the user can choose the exact format of the log file by specifying --log-file-format: json or text.
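For example, reusing the connection options from the redirection example above (values are illustrative):

pg_timetable -c loader -u scheduler -d timetable --log-file=pgtt.log --log-file-format=json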
Database log efficiency
Another issue that required attention was database logging. Busy clients produced a lot of information that was stored with separate INSERT statements, causing intensive use of connections. We have therefore implemented a log hook for the well-known logrus package.
The key benefits are:
- use of the COPY machinery;
- a cache controlled by time and by the number of logged entries;
- the ability to skip entry logging under extremely high load.
I personally encourage you to incorporate our log hook into your projects. Let us know if you think we should make it a standalone project.
Tasks output
pg_timetable has stored the output of program tasks from the very beginning. This is helpful for incident analysis and for debugging after the fact. It wasn’t until Nikolay Samokhvalov made the suggestion that we realized this feature could also be used for the built-in SQL tasks. So now one can have an SQL task with a SELECT statement and see what output is saved upon execution, e.g.
"SELECT * FROM timetable.active_session"); add_job --------- 1 (1 row) ... my_database=> FROM timetable.execution_log, SELECT chain_id, finished, and output; chain_id | finished | output ----------+-------------------------------+---------- 1 | 2021-08-12 15:40:46.064795+02 | SELECT 2 (1 row)
The SQL task gives back the tag of the command that was run and the number of rows that were returned or changed. We don’t store the result set because commands can be as simple or as complicated as you want. Users should be cautious and explicitly store any important information, such as
my_database=> CREATE TABLE history_session (...);
CREATE TABLE
my_database=> SELECT timetable.add_job('dump-sessions', '* * * * *',
                  'INSERT INTO history_session SELECT now(), * FROM timetable.active_session');
 add_job
---------
       3
(1 row)
...
my_database=> SELECT * FROM history_session;
              ts               | client_pid | client_name | server_pid
-------------------------------+------------+-------------+------------
 2021-08-12 15:52:02.378981+02 |      10440 | loader-foo  |      15676
 2021-08-12 15:52:02.378981+02 |      10440 | loader-foo  |      16436
 2021-08-12 15:52:02.378981+02 |      10440 | loader-foo  |      14892
 2021-08-12 15:52:02.378981+02 |      10440 | loader-foo  |      25480
 2021-08-12 15:53:02.446911+02 |      19664 | writer-boo  |      21876
 2021-08-12 15:53:02.446911+02 |      19664 | writer-boo  |      11852
 2021-08-12 15:53:02.446911+02 |      19664 | writer-boo  |      10052
 2021-08-12 15:53:02.446911+02 |      19664 | writer-boo  |      27396
(8 rows)

my_database=> SELECT chain_id, finished, output FROM timetable.execution_log;
 chain_id |           finished            |   output
----------+-------------------------------+------------
        1 | 2021-08-12 15:40:46.064795+02 | SELECT 2
        1 | 2021-08-12 15:52:02.380711+02 | SELECT 4
        3 | 2021-08-12 15:52:02.382296+02 | INSERT 0 4
        1 | 2021-08-12 15:52:07.34872+02  | SELECT 3
        3 | 2021-08-12 15:52:07.448914+02 | INSERT 0 4
(5 rows)
Resource Choices
Our scheduler should be as adaptable as possible. We have therefore added the resource management options category:
--cron-workers: number of concurrent workers for scheduled chains (16 by default);
--interval-workers: number of concurrent workers for interval chains (16 by default);
--chain-timeout: abort any chain that takes longer than the specified number of milliseconds;
--task-timeout: abort any task in a chain that takes longer than the specified number of milliseconds.
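For example, a client with more cron workers and tighter timeouts could be started like this (values are illustrative):

pg_timetable -c loader -u scheduler -d timetable --cron-workers=32 --interval-workers=8 --chain-timeout=600000 --task-timeout=60000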
Finally
This was the first piece in a series on the new capabilities of pg_timetable v4. The best features will be emphasized, so keep checking this space.
About Enteros
Enteros offers a patented database performance management SaaS platform. It finds the root causes of complex database scalability and performance problems that affect business across a growing number of cloud, RDBMS, NoSQL, and machine learning database platforms.
The views expressed on this blog are those of the author and do not necessarily reflect the opinions of Enteros Inc. This blog may contain links to the content of third-party sites. By providing such links, Enteros Inc. does not adopt, guarantee, approve, or endorse the information, views, or products available on such sites.
Are you interested in writing for Enteros’ Blog? Please send us a pitch!