There are numerous reasons why someone might want to migrate pgAgent jobs to pg_timetable. Most of them can be found in the supported platforms table of the official README and in the PostgreSQL schedulers comparison table I introduced in one of my earlier posts.
In an earlier article, I outlined the rationale and approaches for migrating scheduled jobs from pg_cron to pg_timetable.
Distinctions and Similarities
- Both tools have a multi-step job architecture: pgAgent uses "job" and "step" terminology, while pg_timetable uses "chain" and "task".
- pgAgent can run BATCH (shell) and SQL commands. A pg_timetable task can be one of SQL, PROGRAM, or BUILTIN. To mimic pgAgent's BATCH step, pg_timetable therefore executes a PROGRAM task that invokes the shell, e.g. `bash -c "command"`.
- pgAgent lets you specify several schedules for a single job. If multiple schedules are found, the chain definition should be duplicated for each of them.
- pgAgent stores scheduling data in boolean arrays, whereas pg_timetable uses cron notation. We must therefore provide a way to convert the arrays to cron notation.
- Both pgAgent and pg_timetable allow ignoring task failures. pgAgent offers two options: fail the step but mark it successful, or fail the step but ignore the error and continue. With pg_timetable, only ignoring the error without marking the task as successful is possible.
- pg_timetable is more flexible in how it restricts execution: it uses the client name, whereas pgAgent restricts step execution by the agent's host. Here we assume that after migration the host name becomes the client name.
- pgAgent allows setting a schedule's start and end timestamps, as well as an enabled flag. Only schedules that are enabled and currently active, i.e. within their date range, will be migrated.
Prerequisites
- The pgagent extension is installed in your database.
- You have scheduled pgAgent jobs to export.
- You have created the pg_timetable schema by running pg_timetable at least once against this database; the schema is generated automatically. Whether you have already added some chains is irrelevant.
Full description of the migration script from pgAgent to pg_timetable
Use this SQL snippet to export all available information as precisely as possible. I'll walk you through it step by step.
bool_array_to_cron()
As mentioned earlier, we need to convert the boolean arrays that store execution data for minutes, hours, days, and months into proper cron notation.
The first thing to note is that the arrays come in different sizes: 60 for minutes, 12 for months, and so on. Second, cron values for hours and minutes start at 0, while for the other units they start at 1.
The trick is straightforward: unnest the input array, pair each element with its ordinal number, filter out rows with false values, then aggregate the remaining numbers into a cron-usable string. If all elements of the array are false, use the asterisk (*) cron notation.
```sql
CREATE OR REPLACE FUNCTION bool_array_to_cron(bool[], start_with int4 DEFAULT 0)
RETURNS TEXT AS $$
WITH u AS (
    SELECT unnest($1) AS e,
           generate_series($2, array_length($1, 1) - 1 + $2) AS i
)
SELECT COALESCE(string_agg(i::text, ','), '*')
FROM u
WHERE e
$$ LANGUAGE sql;
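To see the conversion in action, here is a small sketch with hand-made arrays (shortened for readability; real pgAgent arrays have 60, 24, etc. elements):

```sql
-- Minute-style array (indexing starts at 0): elements at positions 1 and 3 are true
SELECT bool_array_to_cron(ARRAY[false, true, false, true, false]);      -- → '1,3'

-- Month-style array (indexing starts at 1): only the first element is true
SELECT bool_array_to_cron(ARRAY[true, false, false, false], 1);         -- → '1'

-- All elements false: fall back to the asterisk
SELECT bool_array_to_cron(ARRAY[false, false, false]);                  -- → '*'
```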
Which shell to use
pg_timetable is a cross-platform program that runs on most popular systems, so we need to decide which shell will be used for pgAgent's batch tasks. On Unix-like platforms, the obvious option is `sh -c "cmd"`. On Windows, it might be `pwsh -c "command"` or `cmd /C "command"`.
That is exactly what the cte_shell clause accomplishes. Choosing the default shell is simple:
```sql
cte_shell(shell, cmd_param) AS (
    VALUES ('sh', '-c')
)
```
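If you run the agents on Windows, you would swap in one of the shells mentioned above instead; for example (adjust to the shell of your choice):

```sql
cte_shell(shell, cmd_param) AS (
    VALUES ('cmd', '/C')
)
```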
Please note the naming pattern I'm using here: CTE clauses starting with "cte_" produce the final result, while clauses starting with "pga_" prepare pgAgent-specific result sets for transformation.
pga_schedule
This CTE clause returns all currently active pgAgent schedules in cron notation. Later, the schedule name will be incorporated into the chain name.
```sql
pga_schedule AS (
    SELECT s.jscjobid, s.jscname,
           format('%s %s %s %s %s',
               bool_array_to_cron(s.jscminutes),
               bool_array_to_cron(s.jschours),
               bool_array_to_cron(s.jscmonthdays),
               bool_array_to_cron(s.jscmonths, 1),
               bool_array_to_cron(s.jscweekdays, 1)) AS schedule
    FROM pgagent.pga_schedule s
    WHERE s.jscenabled
      AND now() < COALESCE(s.jscend, 'infinity'::timestamptz)
      AND now() > s.jscstart
)
```
pga_chain and cte_chain
This clause returns all scheduled pgAgent jobs. We cannot reuse jobid directly, because the destination database may already contain some chains. Instead, we generate the next chain_id from the sequence.
Because pgAgent may have several schedules for the same job, we define the chain name as a concatenation of the job name and the schedule name. cte_chain then inserts all selected rows into the timetable.chain table.
```sql
pga_chain AS (
    SELECT nextval('timetable.chain_chain_id_seq'::regclass) AS chain_id,
           jobid,
           format('%s @ %s', jobname, jscname) AS jobname,
           jobhostagent, jobenabled, schedule
    FROM pgagent.pga_job
    JOIN pga_schedule ON jobid = jscjobid
),
cte_chain AS (
    INSERT INTO timetable.chain (chain_id, chain_name, client_name, run_at, live)
    SELECT chain_id, jobname, jobhostagent, schedule, jobenabled
    FROM pga_chain
)
```
pga_step
This is the most challenging clause, so let me explain what is going on. Similar to how we generated the chain_id, we also generate the task_id.
pgAgent executes steps in alphanumeric order of their names. pg_timetable specifies task order with a numeric value, so we use the rank() window function to produce these numbers.
The connection string is the trickiest part. Both pgAgent and pg_timetable can execute SQL tasks remotely. In pgAgent settings, you can specify either a connection string for a remote cluster or a database name to run a task on the same cluster. pg_timetable uses connection-string notation in both cases. The logic is as follows:
- If a connection string is provided, use it. Use NULLIF to ignore empty strings ('').
- Otherwise, check whether the database name matches the current database. If so, leave the connection string NULL.
- If not, build the appropriate connection string from the supplied database name.
```sql
pga_step AS (
    SELECT c.chain_id,
           nextval('timetable.task_task_id_seq'::regclass) AS task_id,
           rank() OVER (ORDER BY jstname) AS jstorder,
           jstid, jstname, jstenabled,
           CASE jstkind WHEN 'b' THEN 'PROGRAM' ELSE 'SQL' END AS jstkind,
           jstcode,
           COALESCE(
               NULLIF(jstconnstr, ''),
               CASE
                   WHEN jstdbname = current_database() THEN NULL
                   WHEN jstdbname > '' THEN 'dbname=' || jstdbname
               END
           ) AS jstconnstr,
           jstonerror != 'f' AS jstignoreerror
    FROM pga_chain c
    JOIN pgagent.pga_jobstep js ON c.jobid = js.jstjobid
)
```
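The connection-string logic above can be illustrated with literal values (a sketch; here 'otherdb' stands for a step's jstdbname that differs from the current database):

```sql
-- Empty connection string is discarded by NULLIF;
-- the database name differs from the current one,
-- so a connection string is built from it.
SELECT COALESCE(
    NULLIF('', ''),
    CASE
        WHEN 'otherdb' = current_database() THEN NULL
        WHEN 'otherdb' > '' THEN 'dbname=' || 'otherdb'
    END
);  -- → 'dbname=otherdb' (unless your database happens to be named otherdb)
```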
cte_tasks and cte_parameters
These clauses are rather straightforward. The only thing that needs explanation is how PROGRAM tasks are specified in pg_timetable. The command column should contain only the program name (possibly with the full path); in our example, that is the shell we chose in the first step. All command-line arguments used to run the shell go into the timetable.parameter list. Note that we don't do any quote escaping here: pg_timetable handles all parameter values automatically.
```sql
cte_tasks AS (
    INSERT INTO timetable.task (task_id, chain_id, task_name, task_order,
                                kind, command, database_connection)
    SELECT task_id, chain_id, jstname, jstorder,
           jstkind::timetable.command_kind,
           CASE jstkind WHEN 'SQL' THEN jstcode ELSE sh.shell END,
           jstconnstr
    FROM pga_step, cte_shell sh
),
cte_parameters AS (
    INSERT INTO timetable.parameter (task_id, order_id, value)
    SELECT task_id, 1, jsonb_build_array(sh.cmd_param, s.jstcode)
    FROM pga_step s, cte_shell sh
    WHERE s.jstkind = 'PROGRAM'
)
```
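After running the migration script, you might want to double-check what was inserted. A simple sketch of such a sanity check, using the timetable columns populated above:

```sql
-- Inspect the migrated chains and their tasks in execution order
SELECT c.chain_id, c.chain_name, c.run_at,
       t.task_order, t.kind, t.command
FROM timetable.chain c
JOIN timetable.task t USING (chain_id)
ORDER BY c.chain_id, t.task_order;
```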
Finally…
When you execute the migration script, you might see output like this:
```
chain_id|jobid|jobname                                         |jobhostagent|jobenabled|schedule                                                                                                |
--------+-----+------------------------------------------------+------------+----------+--------------------------------------------------------------------------------------------------------+
      91|    1|Download locations and aggregate @ every min    |node1       |true      |* * * * *                                                                                               |
      92|    1|Download locations and aggregate @ test schedule|node1       |true      |2 3,6 3 4 1                                                                                             |
      93|    1|Download locations and aggregate @ run download |node1       |true      |0,5 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23 * 1,2,3,4,5,6,7,8,9,10,11,12 2,3,4,5,6|
```
From this example, we can see that pgAgent has one scheduled job titled "Download locations and aggregate" with three enabled schedules: "every min," "test schedule," and "run download."
To run the migrated chains, pg_timetable should be started with the command-line argument --clientname=node1.
About Enteros
Enteros offers a patented database performance management SaaS platform. It proactively identifies root causes of complex business-impacting database scalability and performance issues across a growing number of clouds, RDBMS, NoSQL, and machine learning database platforms.
The views expressed on this blog are those of the author and do not necessarily reflect the opinions of Enteros Inc. This blog may contain links to the content of third-party sites. By providing such links, Enteros Inc. does not adopt, guarantee, approve, or endorse the information, views, or products available on such sites.