Backup Schedule System
Introduction
In this article, we look at the backup schedule system: the cron jobs that drive it, the nextRunAt calculation, and the management of status states.
BackupSchedule Model
Schema Structure
typescript
interface IBackupSchedule {
  _id?: string;
  resourceType: 'cloud-server' | 'vps';
  resourceId: string;
  serverId?: string; // OpenStack server ID
  serverName?: string;
  customerEmail: string;
  location?: string; // 'HCM' or 'HNI'
  // Schedule configuration
  startHour: number; // 0-23
  intervalDays: number; // 1 (daily), 7 (weekly), 30 (monthly)
  retain: number; // Number of backups to keep
  timezone: string; // 'Asia/Ho_Chi_Minh'
  // Template reference
  scheduleTemplateId?: string;
  // Status tracking
  status: 'idle' | 'pending' | 'running' | 'completed' | 'failed';
  enabled: boolean;
  // Timestamps
  lastRunAt?: Date;
  nextRunAt?: Date;
  startedAt?: Date;
  completedAt?: Date;
  failedAt?: Date;
  // Job tracking
  runningJobId?: string;
  retryCount: number;
  lastError?: string;
}

Cron-Based Scheduler
Scheduler Implementation
typescript
import cron from 'node-cron';
import { backupQueueManager } from './BackupQueueManager';
import { scheduleDueJobs, recoverInterruptedJobs } from '../client/BackupScheduleService';
// Note: `logger` is the application's logger; its import is not shown in this article.

export async function startBackupScheduler() {
  // Initialize queue manager
  await backupQueueManager.initialize();
  // Recover jobs interrupted by the previous shutdown or restart
  await recoverInterruptedJobs();
  // Main scheduler - check for due schedules every hour at minute 5
  const schedulerTask = cron.schedule('5 * * * *', async () => {
    const now = new Date();
    logger.info(`🔄 Backup scheduler tick at ${now.toISOString()}`);
    try {
      const queuedJobs = await scheduleDueJobs(now);
      logger.info(`✅ Backup scheduler completed - Queued ${queuedJobs} jobs`);
    } catch (e: unknown) {
      const message = e instanceof Error ? e.message : String(e);
      logger.error(`❌ Backup scheduler error: ${message}`);
    }
  });
  // Cleanup task - run daily at 2 AM (server time, since no `timezone` option is passed)
  const cleanupTask = cron.schedule('0 2 * * *', async () => {
    logger.info('🧹 Running daily cleanup...');
    try {
      await backupQueueManager.cleanOldJobs();
      // Reset error states for old failed jobs
      // ...
    } catch (e: unknown) {
      // An unhandled rejection inside a cron callback should not go unnoticed
      const message = e instanceof Error ? e.message : String(e);
      logger.error(`❌ Daily cleanup error: ${message}`);
    }
  });
  schedulerTask.start();
  cleanupTask.start();
  logger.info('✅ Backup scheduler started');
}

Cron Schedule Syntax
┌───────────── minute (0 - 59)
│ ┌───────────── hour (0 - 23)
│ │ ┌───────────── day of month (1 - 31)
│ │ │ ┌───────────── month (1 - 12)
│ │ │ │ ┌───────────── day of week (0 - 6) (Sunday to Saturday)
│ │ │ │ │
* * * * *

Examples:
- '5 * * * *' - Every hour at minute 5
- '0 2 * * *' - Every day at 2 AM
- '0 0 * * 0' - Every week on Sunday at midnight
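To make the five fields concrete, the sketch below checks whether a given time matches a cron expression. It is only an illustration and not how node-cron works internally: `cronMatches` is a hypothetical helper, it supports only `*` and single numbers (no ranges, lists, or steps), it requires every field to match, and it reads the date in UTC so the result does not depend on the machine's timezone.

```typescript
// Hypothetical helper for illustration; supports only "*" and plain numbers.
function cronMatches(expression: string, date: Date): boolean {
  const fields = expression.trim().split(/\s+/);
  if (fields.length !== 5) {
    throw new Error(`Expected 5 cron fields, got ${fields.length}`);
  }
  // Same order as the diagram: minute, hour, day of month, month, day of week.
  const actual = [
    date.getUTCMinutes(),
    date.getUTCHours(),
    date.getUTCDate(),
    date.getUTCMonth() + 1, // JavaScript months are 0-based, cron months are 1-based
    date.getUTCDay(), // 0 = Sunday
  ];
  return fields.every((field, i) => field === '*' || Number(field) === actual[i]);
}

// 2025-01-26 is a Sunday.
const sundayMidnight = new Date(Date.UTC(2025, 0, 26, 0, 0));
console.log(cronMatches('0 0 * * 0', sundayMidnight)); // true
console.log(cronMatches('5 * * * *', sundayMidnight)); // false: minute is 0, not 5
console.log(cronMatches('0 2 * * *', new Date(Date.UTC(2025, 0, 26, 2, 0)))); // true
```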
Calculate nextRunAt
Timezone-Aware Calculation
typescript
import moment from 'moment-timezone';

export function calcNextRunAt(schedule: IBackupSchedule): Date {
  const tz = schedule.timezone || 'Asia/Ho_Chi_Minh';
  const now = moment.tz(tz);
  // Today at startHour:00:00.000 in the schedule's timezone
  const today = now.clone()
    .hour(schedule.startHour)
    .minute(0)
    .second(0)
    .millisecond(0);
  // If today's startHour has already passed, schedule for the next interval
  if (now.isAfter(today)) {
    return today.add(schedule.intervalDays, 'days').toDate();
  }
  // Otherwise, schedule for today (intervalDays is not applied in this branch)
  return today.toDate();
}

Example
typescript
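// Schedule: daily at 2 AM, timezone: Asia/Ho_Chi_Minh
// Current time: 2025-01-25 10:00:00 local time (after 2 AM)
const nextRunAt = calcNextRunAt({
  startHour: 2,
  intervalDays: 1,
  timezone: 'Asia/Ho_Chi_Minh'
} as IBackupSchedule); // Cast because the other required fields are omitted for brevity
// Result: 2025-01-26 02:00:00 local time (2025-01-25T19:00:00Z)
// If current time: 2025-01-25 01:00:00 local time (before 2 AM)
// Result: 2025-01-25 02:00:00 local time (2025-01-24T19:00:00Z)

Schedule Due Jobs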
Check and Queue Due Schedules
typescript
// Assumes `moment` (moment-timezone), the `BackupSchedule` model, `backupQueueManager`,
// and `calcNextRunAt` are imported in this module.
export async function scheduleDueJobs(now: Date): Promise<number> {
  let queuedJobs = 0;
  // Without a ready queue nothing can be enqueued. Bail out before touching any
  // schedule, so that none is marked 'pending' without a matching job.
  if (!backupQueueManager.isReady()) return queuedJobs;
  // Find due schedules
  const cursor = BackupSchedule.find({
    enabled: true,
    status: { $in: ['idle', 'completed', 'failed'] },
    $or: [
      { nextRunAt: { $lte: now } },
      { nextRunAt: null }
    ]
  })
    .sort({ nextRunAt: 1, _id: 1 })
    .cursor();
  for await (const schedule of cursor) {
    const tz = schedule.timezone || 'Asia/Ho_Chi_Minh';
    const nowTz = moment.tz(now, tz);
    // Check if should run this hour (a schedule whose hour was missed waits
    // until the same hour on a later day)
    const shouldRunThisHour = nowTz.hour() === schedule.startHour;
    // Check calendar days since last run, in the schedule's timezone
    const lastRun = schedule.lastRunAt
      ? moment.tz(schedule.lastRunAt, tz)
      : null;
    const daysSinceLastRun = !lastRun
      ? Infinity
      : nowTz.clone().startOf('day').diff(lastRun.clone().startOf('day'), 'days');
    const dueByInterval = daysSinceLastRun >= schedule.intervalDays;
    // Skip if not due
    if (!(shouldRunThisHour && dueByInterval)) continue;
    // Skip if already running or failed too many times
    if (schedule.status === 'running' ||
        (schedule.status === 'failed' && schedule.retryCount >= 3)) {
      continue;
    }
    // Update status to pending
    await BackupSchedule.findByIdAndUpdate(schedule._id, {
      $set: {
        status: 'pending',
        nextRunAt: calcNextRunAt(schedule)
      }
    });
    // Add to queue
    await backupQueueManager.addBackupJob(schedule);
    queuedJobs++;
  }
  return queuedJobs;
}

Status States
Status Lifecycle
idle → pending → running → completed → idle
                    ↓
                 failed → (retry) → running
                    ↓
                 (max retries) → idle

Status Descriptions
| Status | Description | Next Action |
|---|---|---|
| idle | Waiting for next run | Scheduler checks every hour |
| pending | Queued in BullMQ | Worker picks up job |
| running | Backup in progress | Update on completion/failure |
| completed | Backup succeeded | Reset to idle |
| failed | Backup failed | Retry or reset after max attempts |
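The lifecycle and the table above can be encoded as a transition map, so that an unexpected status change is caught before it is written to the database. This is a sketch under assumptions rather than code from the system: `ALLOWED_TRANSITIONS`, `canTransition`, `canRetry`, and `MAX_RETRIES` are hypothetical names, and the limit of 3 mirrors the `retryCount >= 3` check in `scheduleDueJobs`.

```typescript
type BackupStatus = 'idle' | 'pending' | 'running' | 'completed' | 'failed';

// Hypothetical constant, mirroring the `retryCount >= 3` check in scheduleDueJobs.
const MAX_RETRIES = 3;

// Transitions taken from the lifecycle diagram, plus the ones the code performs:
// scheduleDueJobs queues idle/completed/failed schedules, and
// recoverInterruptedJobs resets an interrupted 'running' schedule to 'idle'.
const ALLOWED_TRANSITIONS: Record<BackupStatus, readonly BackupStatus[]> = {
  idle: ['pending'],
  pending: ['running'],
  running: ['completed', 'failed', 'idle'],
  completed: ['idle', 'pending'],
  failed: ['running', 'pending', 'idle'],
};

function canTransition(from: BackupStatus, to: BackupStatus): boolean {
  return ALLOWED_TRANSITIONS[from].includes(to);
}

// A failed schedule stays eligible for retry until it reaches the limit.
function canRetry(retryCount: number): boolean {
  return retryCount < MAX_RETRIES;
}

console.log(canTransition('idle', 'pending')); // true
console.log(canTransition('idle', 'running')); // false: must be queued first
console.log(canRetry(2)); // true
console.log(canRetry(3)); // false
```

A guard like `canTransition` would typically be called just before each status update, with the current status read from the schedule document.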
Job Recovery
Recover Interrupted Jobs
typescript
export async function recoverInterruptedJobs(): Promise<void> {
  // Find schedules that were still marked 'running' when the app stopped
  const interruptedJobs = await BackupSchedule.find({
    status: 'running'
  });
  for (const job of interruptedJobs) {
    // Check if job is still active in queue
    if (job.runningJobId && backupQueueManager.isReady()) {
      const queue = backupQueueManager.getQueue();
      const activeJob = await queue?.getJob(job.runningJobId);
      if (activeJob && await activeJob.isActive()) {
        // Job still running, skip
        continue;
      }
    }
    // Reset status
    await BackupSchedule.findByIdAndUpdate(job._id, {
      $set: {
        status: 'idle',
        runningJobId: null,
        lastError: null
      }
    });
    // Reschedule if due (`shouldReschedule` is a helper that is not shown in this article)
    const now = new Date();
    if (backupQueueManager.isReady() && shouldReschedule(job, now)) {
      await backupQueueManager.addBackupJob(job, 5000); // 5s delay
    }
  }
}

Best Practices
1. Timezone Handling
typescript
// ✅ DO: Always use timezone-aware moment
const nowTz = moment.tz(now, schedule.timezone || 'Asia/Ho_Chi_Minh');
const hour = nowTz.hour();
// ❌ DON'T: Read local-time fields from a plain Date
const serverHour = new Date().getHours(); // Interpreted in the server's timezone

2. Status Tracking
typescript
// ✅ DO: Update status at each stage
await BackupSchedule.findByIdAndUpdate(id, {
  $set: { status: 'pending' } // When queued
});
// ... later
await BackupSchedule.findByIdAndUpdate(id, {
  $set: { status: 'running' } // When started
});

3. Calculate nextRunAt Early
typescript
// ✅ DO: Calculate nextRunAt when queuing
await BackupSchedule.findByIdAndUpdate(id, {
  $set: {
    status: 'pending',
    nextRunAt: calcNextRunAt(schedule) // Calculate now
  }
});
// ❌ DON'T: Calculate after completion (may miss schedule window)

Summary
Key Points
Cron-Based Scheduler
- Check every hour at minute 5
- Daily cleanup at 2 AM
nextRunAt Calculation
- Timezone-aware
- Based on startHour and intervalDays
Status Management
- Track status at each stage
- Recover interrupted jobs
Due Check Logic
- Check hour match
- Check days since last run
- Skip running/failed jobs
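The three due-check steps above can be combined into one pure function that takes `now` as an argument, which makes the logic easy to test. The following is a dependency-free sketch, not the implementation used in this article: it swaps moment-timezone for the built-in `Intl.DateTimeFormat`, and `DueCheckInput`, `localParts`, `calendarDaysBetween`, and `isDue` are hypothetical names. The retry limit of 3 mirrors the check in `scheduleDueJobs`.

```typescript
interface DueCheckInput {
  startHour: number; // 0-23
  intervalDays: number;
  timezone: string; // IANA name, e.g. 'Asia/Ho_Chi_Minh'
  lastRunAt?: Date;
  status: 'idle' | 'pending' | 'running' | 'completed' | 'failed';
  retryCount: number;
}

// Read the wall-clock date and hour of an instant in an IANA timezone.
function localParts(instant: Date, timeZone: string) {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone,
    year: 'numeric',
    month: 'numeric',
    day: 'numeric',
    hour: 'numeric',
    hour12: false,
  }).formatToParts(instant);
  const get = (type: string) => Number(parts.find((p) => p.type === type)?.value);
  // Some engines print midnight as "24" when hour12 is false, hence the modulo.
  return { year: get('year'), month: get('month'), day: get('day'), hour: get('hour') % 24 };
}

// Whole calendar days between two instants, as seen in the given timezone.
function calendarDaysBetween(from: Date, to: Date, timeZone: string): number {
  const a = localParts(from, timeZone);
  const b = localParts(to, timeZone);
  const msPerDay = 24 * 60 * 60 * 1000;
  return Math.round(
    (Date.UTC(b.year, b.month - 1, b.day) - Date.UTC(a.year, a.month - 1, a.day)) / msPerDay,
  );
}

function isDue(schedule: DueCheckInput, now: Date): boolean {
  // 1. Hour match in the schedule's own timezone.
  if (localParts(now, schedule.timezone).hour !== schedule.startHour) return false;
  // 2. Enough calendar days since the last run (a schedule that never ran is due).
  const days = schedule.lastRunAt
    ? calendarDaysBetween(schedule.lastRunAt, now, schedule.timezone)
    : Infinity;
  if (days < schedule.intervalDays) return false;
  // 3. Skip schedules that are queued or running, and failed ones out of retries.
  if (schedule.status === 'running' || schedule.status === 'pending') return false;
  if (schedule.status === 'failed' && schedule.retryCount >= 3) return false;
  return true;
}

const example: DueCheckInput = {
  startHour: 2,
  intervalDays: 1,
  timezone: 'Asia/Ho_Chi_Minh',
  lastRunAt: new Date('2025-01-24T19:05:00Z'), // 2025-01-25 02:05 in Ho Chi Minh City
  status: 'completed',
  retryCount: 0,
};
console.log(isDue(example, new Date('2025-01-25T19:05:00Z'))); // true: 02:05 local, one day later
console.log(isDue(example, new Date('2025-01-25T20:05:00Z'))); // false: 03:05 local
```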
Next Steps
In the next article, we will look at:
- Queue Management - BullMQ integration and job processing
Last Updated: 2025-01-25