Cron & Interval Scheduling Logic

Time-series platforms live and die by when their tasks fire. A downsampling job that overlaps its previous run double-writes aggregates; a retention sweep that skips a daylight-saving boundary leaves an hour of raw data unpurged; an interval task whose window is shorter than late-arriving IoT packets silently drops readings. Every one of these failures traces back to the same root decision: whether a task is triggered by a calendar-aligned cron expression or a fixed-duration interval, and how its window, offset, and concurrency are configured. This page is the reference for getting that decision right within Automated Task Scheduling & Orchestration, covering the temporal semantics, runnable configurations, and the failure modes that surface only in production.

The failure scenario this solves

Consider a fleet of vibration sensors writing to a raw_telemetry bucket at 1 Hz. An engineer schedules a downsampling task with every: 15m and a query window of range(start: -15m). It passes review and works in staging. In production, edge gateways buffer readings during a cellular outage and flush a batch that is timestamped 12 minutes in the past. If that batch arrives just after a quarter-hour run, the run that covered those timestamps has already read its 15-minute window, and the next run's window starts after them. The batch falls between two reads and its readings are never aggregated. No error is raised. The _tasks system bucket shows every run succeeding.

The fix is not “make the window bigger” applied blindly. It is understanding that interval scheduling anchors relative to run time, that cron scheduling anchors to a calendar grid, and that offset exists precisely to let ingestion buffers drain before a window is read. The sections below make those mechanics concrete so you can size windows and offsets against your actual late-arrival distribution rather than guessing.
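The scenario can be reproduced on paper with a few lines of Python. This is a minimal sketch, assuming windows of the form [run - every, run) that are aligned to the epoch; `covering_run` and `is_aggregated` are hypothetical helper names, not part of any InfluxDB API.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def covering_run(ts: datetime, every: timedelta) -> datetime:
    """Scheduled time of the run whose window [run - every, run) contains ts."""
    periods = (ts - EPOCH) // every        # whole intervals elapsed before ts
    return EPOCH + (periods + 1) * every   # the window closes at the next boundary

def is_aggregated(ts, arrived, every, offset=timedelta(0)) -> bool:
    """True if the point was already stored when its covering run executed."""
    executes_at = covering_run(ts, every) + offset
    return arrived <= executes_at

# A reading stamped 10:04 that only reaches the database at 10:16 (12 minutes late).
reading = datetime(2024, 5, 1, 10, 4, tzinfo=timezone.utc)
arrived = datetime(2024, 5, 1, 10, 16, tzinfo=timezone.utc)
quarter = timedelta(minutes=15)

print(is_aggregated(reading, arrived, quarter))                               # missed
print(is_aggregated(reading, arrived, quarter, offset=timedelta(minutes=5)))  # caught
```

The only thing that changes between the two calls is the offset: the window is identical, but the delayed execution gives the late batch time to land.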

Prerequisites

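InfluxDB 2.7+ (OSS) or an InfluxDB Cloud organization that exposes the Flux task engine. Flux tasks are a 2.x-line feature; verify task support before targeting an InfluxDB 3 product, because the 3.x releases are built around SQL and InfluxQL rather than Flux.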
Flux 0.x query language (bundled with the above; no separate install).
Two buckets provisioned: a short-retention source (e.g. raw_telemetry) and a longer-retention destination (e.g. downsampled_telemetry). Bucket sizing follows the InfluxDB Data Lifecycle & Architecture Fundamentals guidance.
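An operator or all-access token, or a scoped token with read on the source bucket, write on the destination bucket, and read/write on tasks. The _tasks system bucket is written by the engine itself, so it only needs to be readable.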
For programmatic provisioning: Python 3.9+ and influxdb-client 1.36+.
Node clocks synchronized via NTP or chrony (drift above a few seconds corrupts interval anchoring).

Core concept: how the two triggers anchor time

An InfluxDB task is a Flux script whose option task record carries exactly one of two mutually exclusive keys: cron or every. That single choice changes how the scheduler computes the next fire time and, critically, what -task.every or the implied window resolves to inside your query.

Cron scheduling aligns execution to an absolute calendar grid using a five- or six-field expression (second? minute hour day-of-month month day-of-week). The scheduler computes the next timestamp that matches the pattern in UTC and fires there regardless of how long the previous run took. This makes cron ideal for human-aligned obligations such as end-of-day rollups, midnight retention sweeps, and business-hour compliance reports, where the timestamp matters more than the spacing.
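To make "next timestamp that matches the pattern" concrete, here is a deliberately small sketch of a five-field matcher. It is an illustration under stated assumptions, not the scheduler's implementation: it handles only `*` and comma-separated integers, requires every field to match, and expects a timezone-aware `after` value. `next_fire` is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone

def _matches(field: str, value: int) -> bool:
    """Match one cron field; only '*' and comma-separated integers are handled."""
    return field == "*" or value in {int(part) for part in field.split(",")}

def next_fire(expr: str, after: datetime) -> datetime:
    """First whole minute strictly after `after` (in UTC) matching a 5-field expression."""
    minute, hour, dom, month, dow = expr.split()
    candidate = after.astimezone(timezone.utc).replace(second=0, microsecond=0)
    candidate += timedelta(minutes=1)
    for _ in range(366 * 24 * 60):                 # search at most one year ahead
        cron_dow = (candidate.weekday() + 1) % 7   # cron counts Sunday as 0
        if (_matches(minute, candidate.minute) and _matches(hour, candidate.hour)
                and _matches(dom, candidate.day) and _matches(month, candidate.month)
                and _matches(dow, cron_dow)):
            return candidate
        candidate += timedelta(minutes=1)
    raise ValueError(f"no match within a year for {expr!r}")

start = datetime(2024, 3, 10, 1, 30, tzinfo=timezone.utc)
print(next_fire("0 2 * * *", start))   # same day, 02:00 UTC
```

The result depends only on the expression and the clock, never on when the previous run finished, which is the property that distinguishes calendar anchoring from run-relative spacing.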

Interval scheduling fires every fixed duration measured from a zero epoch, not from the previous run’s completion. With every: 15m, runs land at :00, :15, :30, :45 past the hour — the interval is snapped to the duration boundary, which is why two nodes with synchronized clocks compute identical fire times. Intervals shine for continuous, spacing-sensitive work: rolling downsampling, sliding-window anomaly checks, and the kind of steady data hygiene described in downsampling & aggregation pipeline design.
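The boundary snapping described above reduces to integer division on the time elapsed since the epoch. A minimal sketch, assuming the zero epoch is 1970-01-01T00:00:00Z; `next_boundary` is a hypothetical name used for illustration only.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def next_boundary(now: datetime, every: timedelta) -> datetime:
    """Smallest multiple of `every` since the epoch that is strictly after `now`."""
    elapsed_periods = (now - EPOCH) // every
    return EPOCH + (elapsed_periods + 1) * every

quarter = timedelta(minutes=15)
node_a = datetime(2024, 5, 1, 10, 7, 31, tzinfo=timezone.utc)
node_b = datetime(2024, 5, 1, 10, 7, 33, tzinfo=timezone.utc)  # clock two seconds ahead

print(next_boundary(node_a, quarter))
print(next_boundary(node_b, quarter))
```

Both nodes arrive at the same fire time because the computation discards everything below the interval boundary; small clock differences matter only when they straddle a boundary.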

The offset option is orthogonal to both: it delays the actual execution by a fixed duration after the computed fire time, without shifting the query’s data window. Inside the task, now() resolves to the scheduled time rather than the delayed execution time, which is why relative ranges stay anchored. A task with every: 1h, offset: 5m fires at 01:05, 02:05, … but still reads the 00:00–01:00 window. That gap is where late data drains in.
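The separation between execution time and read window can be written down directly. This is a sketch of the arithmetic only, with `Run` and `plan_run` as hypothetical names; it assumes the window is [scheduled - every, scheduled).

```python
from datetime import datetime, timedelta, timezone
from typing import NamedTuple

class Run(NamedTuple):
    executes_at: datetime
    window_start: datetime
    window_stop: datetime

def plan_run(scheduled: datetime, every: timedelta, offset: timedelta) -> Run:
    """offset moves execution only; the window stays anchored to the scheduled time."""
    return Run(
        executes_at=scheduled + offset,
        window_start=scheduled - every,
        window_stop=scheduled,
    )

run = plan_run(
    scheduled=datetime(2024, 5, 1, 1, 0, tzinfo=timezone.utc),
    every=timedelta(hours=1),
    offset=timedelta(minutes=5),
)
print(run.executes_at.time(), run.window_start.time(), run.window_stop.time())
```

For the every: 1h, offset: 5m example, the run executes at 01:05 and reads 00:00 to 01:00.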

Cron fires on the UTC calendar grid; interval snaps to the duration boundary; offset delays execution without moving the read window, giving buffered IoT batches time to arrive before the window is read.

Both models evaluate against UTC. There is no per-task timezone field; regional alignment is achieved by translating local wall-clock times to UTC and, where daylight saving matters, handling the shift inside the query or accepting a conservative fixed offset. The dedicated child page on configuring cron expressions for timezone-aware InfluxDB tasks walks through DST-safe templates end to end.
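Translating a local wall-clock time to UTC is a one-liner with the standard library, and running it for a winter and a summer date shows the daylight-saving shift. A hedged sketch: `utc_cron_for_local` is a hypothetical helper, the timezone is an arbitrary example, and `zoneinfo` needs a timezone database on the host (or the `tzdata` package).

```python
from datetime import date, datetime, time, timezone
from zoneinfo import ZoneInfo  # standard library from Python 3.9

def utc_cron_for_local(day: date, local: time, tz_name: str) -> str:
    """Daily 5-field cron expression (UTC) that lands on `local` wall-clock time on `day`."""
    local_dt = datetime.combine(day, local, tzinfo=ZoneInfo(tz_name))
    utc_dt = local_dt.astimezone(timezone.utc)
    return f"{utc_dt.minute} {utc_dt.hour} * * *"

nine_am = time(9, 0)
print(utc_cron_for_local(date(2024, 1, 15), nine_am, "America/New_York"))  # standard time
print(utc_cron_for_local(date(2024, 7, 15), nine_am, "America/New_York"))  # daylight time
```

The two expressions differ by one hour, so a single fixed UTC expression can match the local hour for only part of the year.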

Step-by-step implementation

1. Define a calendar-aligned cron task

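Use cron when the trigger must land on a specific clock time. The expression 0 2 * * * fires at 02:00 UTC daily. The offset: 15m delays execution to 02:15, giving the final writes before 02:00 time to settle before the rollup reads them. Because relative ranges resolve against the scheduled time, range(start: -1d) covers the 24 hours ending at 02:00 UTC, not the previous calendar day.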

flux

option task = {
    name: "daily_compliance_rollup",
    cron: "0 2 * * *",
    offset: 15m,
}

from(bucket: "raw_telemetry")
    |> range(start: -1d)
    |> filter(fn: (r) => r._measurement == "sensor_readings")
    |> aggregateWindow(every: 1h, fn: mean, createEmpty: false)
    |> to(bucket: "aggregated_metrics")

The critical parameter here is createEmpty: false. It stops aggregateWindow from emitting null-valued rows for hours where a sensor reported nothing, so the rollup carries only windows that actually contain data.

2. Define a drift-resistant interval task

Use every for continuous cadences. Note the window range(start: -task.every): binding the window to the task option keeps the query self-consistent if you later retune the cadence in one place.

flux

option task = {
    name: "continuous_downsample",
    every: 15m,
    offset: 5m,
}

from(bucket: "raw_telemetry")
    |> range(start: -task.every)
    |> filter(fn: (r) => r._measurement == "vibration_metrics")
    |> aggregateWindow(every: 1m, fn: max, createEmpty: false)
    |> to(bucket: "downsampled_telemetry")

3. Size the window for late-arriving data

If gateways buffer and flush late, the window must span both the cadence and the maximum expected lateness. Decouple the read window from task.every and add a lateness margin, then deduplicate on write. Overlapping windows are safe only when the destination write is idempotent — InfluxDB overwrites points with identical measurement, tag set, field, and timestamp, so an aggregate keyed to a deterministic window boundary is naturally idempotent.

flux

option task = {
    name: "downsample_late_safe",
    every: 15m,
    offset: 10m,
}

// Read 25m to cover the 15m cadence plus a 10m lateness margin.
lookback = 25m

from(bucket: "raw_telemetry")
    |> range(start: -lookback)
    |> filter(fn: (r) => r._measurement == "vibration_metrics")
    // Snap to fixed 1m boundaries so re-processed windows overwrite, not duplicate.
    |> aggregateWindow(every: 1m, fn: max, createEmpty: false)
    |> to(bucket: "downsampled_telemetry")

The idempotency reasoning above is expanded in Flux scripting for task automation, which covers writing robust, replay-safe rollup scripts.
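The overwrite rule that makes overlapping windows safe can be modelled with a dictionary keyed on the point identity. This is a toy model under stated assumptions, not InfluxDB's storage engine: aggregates are stamped at the window's stop boundary, and the tag `sensor` and field `amplitude` are made-up names.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def downsample_max(points, window: timedelta) -> dict:
    """Max per window, stamped at the window's deterministic stop boundary."""
    out = {}
    for ts, value in points:
        stop = EPOCH + ((ts - EPOCH) // window + 1) * window
        out[stop] = max(value, out.get(stop, value))
    return out

def write(store: dict, measurement, tags: dict, field, ts, value) -> None:
    """Same measurement, tag set, field and timestamp: the new value replaces the old."""
    store[(measurement, frozenset(tags.items()), field, ts)] = value

def run_task(store: dict, visible_points) -> None:
    for stop, value in downsample_max(visible_points, timedelta(minutes=1)).items():
        write(store, "vibration_metrics", {"sensor": "s1"}, "amplitude", stop, value)

def at(second: int) -> datetime:
    return datetime(2024, 5, 1, 10, 0, second, tzinfo=timezone.utc)

store = {}
run_task(store, [(at(10), 1.0), (at(40), 3.0)])                 # first run
run_task(store, [(at(10), 1.0), (at(40), 3.0), (at(50), 5.0)])  # re-run sees a late point
print(len(store), list(store.values()))
```

The second run replaces the 10:01 aggregate instead of adding a second row, because both runs compute the same key. If the timestamp were derived from the run time instead of the window boundary, the keys would differ and the store would grow.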

4. Provision tasks programmatically

For environments that manage tasks as code, create them through the client rather than the UI so definitions live in version control. This is the entry point to the broader Python client orchestration patterns.

python

import os
from influxdb_client import InfluxDBClient, TaskCreateRequest

client = InfluxDBClient(
    url=os.environ["INFLUX_URL"],
    token=os.environ["INFLUX_TOKEN"],
    org=os.environ["INFLUX_ORG"],
)

flux = """
option task = {name: "edge_anomaly_check", every: 5m, offset: 30s}

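from(bucket: "telemetry")
    |> range(start: -5m)
    |> filter(fn: (r) => r._field == "temperature")
    // A bare mean() drops _time, which to() requires; aggregateWindow keeps it.
    |> aggregateWindow(every: 5m, fn: mean, createEmpty: false)
    |> to(bucket: "alerts")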
"""

tasks_api = client.tasks_api()
created = tasks_api.create_task(
    # A TaskCreateRequest belongs in task_create_request; task= expects a Task model.
    task_create_request=TaskCreateRequest(
        org_id=os.environ["INFLUX_ORG_ID"],
        flux=flux,
        description="Continuous anomaly detection for edge sensors",
        status="active",
    )
)
print(f"Task scheduled: {created.id}")

Configuration reference

Option	Accepted values	Default	Effect
`name`	string	— (required)	Human-readable task identifier; must be unique per organization.
`cron`	5- or 6-field cron expression	—	Fires on a UTC calendar grid. Mutually exclusive with `every`.
`every`	duration literal (`30s`, `15m`, `1h`, `1d`)	—	Fires every fixed duration, snapped to the boundary. Mutually exclusive with `cron`.
`offset`	duration literal	`0s`	Delays execution after the computed fire time; does not shift the query window. Lets buffers drain.
`concurrency`	integer (2.x builds that expose it)	`1`	Max simultaneous runs per task. Left at `1`, late runs queue instead of overlapping.
`retry`	integer (where supported)	engine default	Number of automatic retries for a failed run before it is marked errored.

Duration literals accept ns, us, ms, s, m, h, d, w, mo, y. Cron seconds are optional; a six-field expression enables sub-minute calendar alignment where the build supports it.
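When task definitions live in version control, the mutual-exclusion rule is cheap to enforce before anything reaches the server. The sketch below is one possible pre-flight check, not an official client feature: `TaskOptions` is a hypothetical class, and it validates only the cron/every exclusivity and the cron field count.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class TaskOptions:
    name: str
    cron: Optional[str] = None
    every: Optional[str] = None
    offset: Optional[str] = None

    def __post_init__(self):
        # Exactly one trigger: reject both-set and neither-set.
        if (self.cron is None) == (self.every is None):
            raise ValueError("set exactly one of cron or every")
        if self.cron is not None and len(self.cron.split()) not in (5, 6):
            raise ValueError("cron must have 5 or 6 fields")

    def render(self) -> str:
        """Render the Flux `option task` record for these options."""
        parts = [f'name: "{self.name}"']
        if self.cron is not None:
            parts.append(f'cron: "{self.cron}"')
        else:
            parts.append(f"every: {self.every}")
        if self.offset is not None:
            parts.append(f"offset: {self.offset}")
        return "option task = {" + ", ".join(parts) + "}"

print(TaskOptions(name="downsample", every="15m", offset="10m").render())
```

Failing at construction time keeps an invalid definition out of the deployment path rather than surfacing it as a server-side validation error.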

Common failure modes and fixes

1. Offset too small for late IoT data. Symptom: aggregates are systematically low or show gaps for the most recent window, only under real network conditions. Root cause: the task reads its window before buffered gateway batches have flushed. Fix: measure the 99th-percentile arrival lag and set offset (and, if needed, the read lookback) to exceed it.

flux

// Before: offset: 1m  — reads too early
// After: offset covers observed p99 lateness
option task = {name: "downsample", every: 15m, offset: 10m}

2. Runs pile up because the query outlasts the interval. Symptom: the _tasks history shows growing scheduled-to-started latency; execution timestamps drift later each cycle. Root cause: the query takes longer than every, so with default concurrency: 1 each run waits for the last to finish. Fix: widen the interval, narrow the query (tighter filter, coarser aggregateWindow), or split the work across several smaller tasks — see dependency mapping & DAG construction for decomposing one heavy task into a staged graph.

3. DST shift moves a “9 AM local” report by an hour. Symptom: a report expected at 09:00 local arrives at 08:00 or 10:00 half the year. Root cause: cron evaluates in UTC and has no DST awareness. Fix: pin the expression to UTC deliberately and, where the exact local hour is contractual, gate output inside the query rather than trusting the cron field. The timezone-aware cron templates page gives copy-paste patterns.

4. Duplicate writes after a widened, overlapping window. Symptom: aggregate values look inflated after you extended lookback. Root cause: overlapping windows re-emit points with slightly different timestamps, so they append instead of overwrite. Fix: snap every emitted point to a deterministic boundary (aggregateWindow(every: 1m, ...)) so re-processing overwrites the identical series/timestamp key.

5. Interval task never fires on the boundary you expect. Symptom: an every: 1h task fires at odd minutes. Root cause: clock drift on the node, or confusing offset with the fire time. Fix: verify NTP sync, and remember the fire time is boundary + offset, not boundary.
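For failure mode 1, the offset can be derived from measured arrival lag instead of guessed. A minimal sketch using the nearest-rank percentile; the lag samples are invented for illustration, and `percentile` and `suggest_offset` are hypothetical helpers.

```python
import math
from datetime import timedelta

def percentile(values, pct: float):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def suggest_offset(lags_seconds, pct: float = 99.0,
                   round_to: timedelta = timedelta(minutes=1)) -> timedelta:
    """Round the chosen percentile of arrival lag up to a whole number of `round_to` units."""
    lag = percentile(lags_seconds, pct)
    units = math.ceil(lag / round_to.total_seconds())
    return units * round_to

# Invented sample: arrival lag in seconds (arrival time minus point timestamp).
lags = [5] * 95 + [60, 120, 300, 540, 720]
print(suggest_offset(lags))
```

Sizing to p99 rather than the maximum is a trade-off: the single 720-second outlier in this sample would still be missed by the offset alone, which is the case a wider lookback with idempotent writes is meant to cover.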

Verification and testing

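Confirm real behavior from run history rather than trusting the UI’s green checkmarks. Inside a task, tasks.lastSuccess() returns the time of that task’s last successful run, falling back to orTime when there is none. Using it as the range start makes each run pick up where the last success ended, so a failed or skipped run leaves no hole:

flux

import "influxdata/influxdb/tasks"

from(bucket: "raw_telemetry")
    |> range(start: tasks.lastSuccess(orTime: -1h))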

For a fuller history, read the task runs directly and inspect the delta between scheduledFor and startedAt:

flux

from(bucket: "_tasks")
    |> range(start: -24h)
    |> filter(fn: (r) => r._measurement == "runs")
    |> filter(fn: (r) => r.taskID == "TASK_ID_HERE")
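Once the run records are exported, the delta itself is simple to compute. The sketch below assumes each record has been flattened into a dict with second-precision RFC 3339 strings under the keys `scheduledFor` and `startedAt`; the sample values are invented, and `start_latencies` and `is_piling_up` are hypothetical helpers.

```python
from datetime import datetime

def parse_rfc3339(value: str) -> datetime:
    """Parse a second-precision RFC 3339 timestamp; 'Z' is rewritten for Python 3.9."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))

def start_latencies(runs) -> list:
    """Seconds between scheduledFor and startedAt for each run, oldest first."""
    ordered = sorted(runs, key=lambda r: parse_rfc3339(r["scheduledFor"]))
    return [
        (parse_rfc3339(r["startedAt"]) - parse_rfc3339(r["scheduledFor"])).total_seconds()
        for r in ordered
    ]

def is_piling_up(latencies, min_runs: int = 3) -> bool:
    """True when the last `min_runs` latencies are strictly increasing."""
    tail = latencies[-min_runs:]
    return len(tail) == min_runs and all(a < b for a, b in zip(tail, tail[1:]))

runs = [
    {"scheduledFor": "2024-05-01T10:00:00Z", "startedAt": "2024-05-01T10:00:02Z"},
    {"scheduledFor": "2024-05-01T10:15:00Z", "startedAt": "2024-05-01T10:15:40Z"},
    {"scheduledFor": "2024-05-01T10:30:00Z", "startedAt": "2024-05-01T10:31:30Z"},
]
print(start_latencies(runs), is_piling_up(runs and start_latencies(runs)))
```

If the task has an offset, the baseline latency equals that offset, so the trend matters more than the absolute value.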

Add a deadman health check so a task that stops producing output raises an alert instead of failing silently. The pattern below flags the destination bucket if no points have landed in the last two cadences:

flux

import "influxdata/influxdb/monitor"
import "experimental"

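from(bucket: "downsampled_telemetry")
    // Look back further than the 30m threshold so a stalled series still has a
    // point in range; otherwise every visible point is newer than t and nothing is flagged.
    |> range(start: -2h)
    |> filter(fn: (r) => r._measurement == "vibration_metrics")
    |> monitor.deadman(t: experimental.subDuration(from: now(), d: 30m))
    |> filter(fn: (r) => r.dead == true)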
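One subtlety worth checking in any deadman pattern is that a series can only be flagged if at least one of its points falls inside the queried range. The toy model below illustrates that interaction between lookback and threshold; it is a sketch of the logic, not of the monitor package's implementation, and the series names are invented.

```python
from datetime import datetime, timedelta, timezone

def last_seen(points, now: datetime, lookback: timedelta) -> dict:
    """Newest timestamp per series, considering only points inside the lookback range."""
    newest = {}
    for series, ts in points:
        if now - lookback <= ts < now:
            newest[series] = max(ts, newest.get(series, ts))
    return newest

def dead_series(points, now, lookback: timedelta, threshold: timedelta) -> list:
    """Series whose newest visible point is older than now - threshold."""
    cutoff = now - threshold
    return sorted(s for s, ts in last_seen(points, now, lookback).items() if ts < cutoff)

now = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
points = [
    ("sensor-a", now - timedelta(minutes=5)),    # healthy
    ("sensor-b", now - timedelta(minutes=50)),   # stalled
]
half_hour = timedelta(minutes=30)

print(dead_series(points, now, lookback=half_hour, threshold=half_hour))
print(dead_series(points, now, lookback=timedelta(hours=2), threshold=half_hour))
```

With the lookback equal to the threshold, the stalled series is invisible and the check stays silent; a lookback several times the threshold lets it surface.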

From the CLI, a quick way to confirm a task exists and is active before trusting it:

bash

influx task list --org "$INFLUX_ORG"

Integration points

Scheduling is the trigger layer beneath the rest of the automation stack. The what runs — the transformation itself — belongs to Flux scripting for task automation; this page owns only the when. When a single trigger must fan out into ordered stages (raw → hourly → daily), the ordering is modeled in dependency mapping & DAG construction rather than by stacking offsets. Once cadences are set, the aggregates they produce feed the retention tiers defined in retention policy design, where bucket-expiration windows must be at least as long as the slowest rollup’s coverage. And for pipelines where a source occasionally reports nothing, pair interval scheduling with the fallback chains for missing data patterns so an empty window degrades gracefully instead of writing a gap.

FAQ

Can a single task use both `cron` and `every`?

No. The option task record accepts exactly one of the two. Supplying both is a validation error at task creation. Choose cron for calendar-anchored timestamps and every for spacing-sensitive cadences.

Does `offset` change which data my query reads?

No. offset only delays execution past the computed fire time; the query window (range(start: -task.every) or an explicit range) is unchanged. That separation is exactly what lets ingestion buffers drain before the window is read.

Why does my interval task fire at boundaries instead of relative to the last run?

Interval fire times are snapped to the duration boundary measured from a zero epoch, not accumulated from the previous run’s finish. This makes fire times deterministic and identical across synchronized nodes, so windows stay aligned and reproducible no matter which node computes the schedule.

How do I schedule for a local timezone with daylight saving?

InfluxDB evaluates cron in UTC and has no per-task timezone. Translate the desired local time to UTC, and where the exact local hour is contractual across DST, gate the output inside the Flux query. The timezone-aware cron page has ready templates.

What happens if a run takes longer than the interval?

With the default single-run concurrency, the next scheduled run queues until the current one completes, so runs serialize rather than overlap. Sustained overruns show as growing schedule-to-start latency in the _tasks bucket — the trigger to narrow the query, widen the interval, or split the task into a staged graph.
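The serialization effect is easy to see in a small simulation. This sketch assumes a single execution slot and that queued runs are never dropped, as described above; `simulate` is a hypothetical function and times are expressed as durations from the first scheduled run.

```python
from datetime import timedelta

def simulate(every: timedelta, run_time: timedelta, runs: int) -> list:
    """Schedule-to-start latency per run when runs execute strictly one at a time."""
    latencies = []
    free_at = timedelta(0)                  # when the single slot is next free
    for i in range(runs):
        scheduled = i * every
        started = max(scheduled, free_at)   # wait for the previous run if needed
        latencies.append(started - scheduled)
        free_at = started + run_time
    return latencies

overrun = simulate(every=timedelta(minutes=15), run_time=timedelta(minutes=20), runs=4)
healthy = simulate(every=timedelta(minutes=15), run_time=timedelta(minutes=10), runs=4)
print([int(d.total_seconds() // 60) for d in overrun])
print([int(d.total_seconds() // 60) for d in healthy])
```

When the run time exceeds the interval, latency grows by the difference on every cycle and never recovers on its own; when it fits, latency stays at zero.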

Related

Configuring cron expressions for timezone-aware InfluxDB tasks: DST-safe expression templates and validation.
Flux scripting for task automation: the transformation logic a schedule triggers.
Dependency mapping & DAG construction: sequencing multi-stage pipelines beyond a single trigger.
Python client orchestration patterns: provisioning and rotating tasks as code.
Downsampling & aggregation pipeline design: where interval-driven rollups fit in the wider lifecycle.
