System Design: Data Upload & PDF Report Generation System
Executive Summary
Transform the current environmental monitoring dashboard from a static data viewer into a dynamic data management system with PDF report generation capabilities. Users will be able to upload daily/weekly data, generate automated reports with visualizations, and manage their historical data through a comprehensive management interface.
Current Status: Phase 5+ complete - Extended features including Dust Levels visualization, Weekly Reports, Flow Meter tracking, Climate integration, AI-powered chart descriptions, and external scraper integrations.
Feature Modules Overview
The application has 9 main feature modules in /src/features/:
| Module | Description | Status |
|---|---|---|
| dust-levels | Dust monitoring data visualization and PDF export | ✅ Complete |
| weekly-reports | Structured weekly field report generation with auto-save | ✅ Complete |
| flow-meter | Water usage tracking and refill management | ✅ Complete |
| climate | Weather station data integration | 🔨 Framework |
| report-templates | Customizable markdown-based report templates | ✅ Complete |
| reports | PDF report generation from CSV uploads | ✅ Complete |
| upload | CSV data upload and parsing | ✅ Complete |
| dashboard | Analytics and KPI dashboards | ✅ Complete |
| mine-sites | Mine site location and asset management | ✅ Complete |
1. System Architecture
1.1 High-Level Architecture (Supabase-Based)
┌─────────────────────────────────────────────────────────────────┐
│ Frontend (React + Vite) │
├─────────────────────────────────────────────────────────────────┤
│ Upload UI │ Report Viewer │ Data Management │ Dashboard │
└──────────┬──────────────────────────────────────────────────────┘
│
├─── Supabase Client ────┐
│ │
▼ ▼
┌──────────────────────┐ ┌────────────────────────┐
│ Supabase Backend │ │ PDF Generator Service │
│ │ │ (Node.js + Express) │
├──────────────────────┤ │ - Puppeteer │
│ • Auth (JWT) │ │ - Report Generation │
│ • PostgreSQL DB │◄─┤ - Chart Rendering │
│ • Storage (S3-like) │ └────────────────────────┘
│ • Realtime │
│ • Row Level Security │
│ • Edge Functions │
└──────────────────────┘
│
▼
┌──────────────────────┐
│ Storage Buckets │
│ • CSV Uploads │
│ • PDF Reports │
└──────────────────────┘
Architecture Benefits:
- Supabase Auth: Built-in user authentication with JWT, social logins, email verification
- PostgreSQL: More powerful than SQLite, supports JSON, full-text search, complex queries
- Supabase Storage: S3-compatible storage with CDN, access control, image transforms
- Row-Level Security (RLS): Database-level data isolation per user
- Real-time: Optional real-time subscriptions for live updates
- Edge Functions: Serverless functions for custom logic (alternative to Node.js service)
- Auto-generated API: REST and GraphQL APIs auto-generated from database schema
1.2 Technology Stack Changes
Current (Client-Only):
- React frontend with sql.js (in-browser SQLite)
- Static database file loaded from /data/
New (Supabase Full-Stack):
- Frontend: React + Vite (unchanged)
- Backend-as-a-Service: Supabase
- Database: PostgreSQL (Supabase-hosted)
- Authentication: Supabase Auth (JWT-based, built-in)
- File Storage: Supabase Storage (S3-compatible)
- API: Auto-generated REST/GraphQL API
- Security: Row-Level Security (RLS) policies
- Real-time: Optional real-time subscriptions
- PDF Generation Service: Node.js + Express + TypeScript (separate microservice)
- Rendering: Puppeteer + Recharts
- Deployment: Docker container, Vercel, or Supabase Edge Functions
- API Communication:
- Supabase JS Client for database/auth/storage
- Axios for PDF generation service
- TanStack Query for data fetching
1.3 Authentication & User Management (Supabase Auth)
Overview
Using Supabase Auth eliminates the need to build custom authentication. Supabase provides battle-tested authentication with JWT tokens, social logins, email verification, and password reset out of the box.
Authentication Flow with Supabase
┌─────────────────────────────────────────────────────────────┐
│ Registration Flow │
└─────────────────────────────────────────────────────────────┘
User → Register Form → supabase.auth.signUp() → Supabase Auth → Email Verification → User Created
┌─────────────────────────────────────────────────────────────┐
│ Login Flow │
└─────────────────────────────────────────────────────────────┘
User → Login Form → supabase.auth.signInWithPassword() → Supabase Auth → JWT + Session → Local Storage
┌─────────────────────────────────────────────────────────────┐
│ Authenticated Request Flow │
└─────────────────────────────────────────────────────────────┘
Frontend → Supabase Client (auto-includes JWT) → Supabase API → RLS Policies Check → Response
┌─────────────────────────────────────────────────────────────┐
│ Social Login Flow (Optional) │
└─────────────────────────────────────────────────────────────┘
User → Click "Login with Google" → supabase.auth.signInWithOAuth() → OAuth Provider → Callback → Logged In
Supabase Auth Features Used
Email/Password Authentication:
- signUp() - Register new users
- signInWithPassword() - Login
- signOut() - Logout
- resetPasswordForEmail() - Password reset
Social Authentication (Optional):
- Google, GitHub, GitLab, Bitbucket, etc.
- One-click setup in Supabase dashboard
Session Management:
- Automatic JWT refresh
- Persistent sessions (localStorage/cookies)
- onAuthStateChange() listener for session changes
Email Verification:
- Automatic email sending
- Customizable email templates
- Email confirmation link
User Metadata:
- Store additional user info (full_name, organization)
- Accessible via user.user_metadata
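A minimal client-side sketch of the auth calls described above, assuming the standard @supabase/supabase-js v2 client; the environment variable names and helper names are placeholders, not the project's actual code:

```typescript
import { createClient } from '@supabase/supabase-js';

// Placeholder env variable names; use the project's actual Vite env settings.
const supabase = createClient(
  import.meta.env.VITE_SUPABASE_URL,
  import.meta.env.VITE_SUPABASE_ANON_KEY
);

// Register a user with extra metadata stored in user.user_metadata
export async function register(email: string, password: string, fullName: string) {
  const { data, error } = await supabase.auth.signUp({
    email,
    password,
    options: { data: { full_name: fullName } },
  });
  if (error) throw error;
  return data.user;
}

// Email/password login; the session (JWT + refresh token) is persisted automatically
export async function login(email: string, password: string) {
  const { data, error } = await supabase.auth.signInWithPassword({ email, password });
  if (error) throw error;
  return data.session;
}

// React to session changes (login, logout, token refresh)
supabase.auth.onAuthStateChange((event, session) => {
  console.log('Auth event:', event, 'user:', session?.user?.id);
});
```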
User Roles Implementation
Roles stored in custom user_profiles table and synced with Supabase Auth:
User (Regular):
- Upload data (own data only via RLS)
- Generate reports
- View/download own reports
- Manage own data
- View own dashboard
Admin:
- All user permissions
- View all users' data (bypass RLS)
- Manage users (create, edit, delete)
- System settings
- View system analytics
Row-Level Security (RLS) Policies
Database-level security ensures users can only access their own data:
-- Users can only see their own uploads
CREATE POLICY "Users can view own uploads"
ON uploads FOR SELECT
USING (auth.uid() = user_id);
-- Users can only insert their own uploads
CREATE POLICY "Users can create own uploads"
ON uploads FOR INSERT
WITH CHECK (auth.uid() = user_id);
-- Users can only delete their own uploads
CREATE POLICY "Users can delete own uploads"
ON uploads FOR DELETE
USING (auth.uid() = user_id);
-- Admins can view all data
CREATE POLICY "Admins can view all uploads"
ON uploads FOR SELECT
USING (
EXISTS (
SELECT 1 FROM user_profiles
WHERE id = auth.uid() AND role = 'admin'
)
);
Session Management
- JWT Tokens: Automatically managed by Supabase client
- Token Expiration: 1 hour (auto-refreshed)
- Refresh Tokens: 7 days (can be configured)
- Token Storage: localStorage (secure, HTTPS only)
- Token Revocation: signOut() invalidates the session
2. Core Features & User Flows
2.1 Feature: Data Upload (ZIP with Site Folder Structure)
User Story: As a user, I want to upload daily or weekly measurement data as a ZIP package containing site folders with CSV files, so the system can store it and generate reports.
Flow:
- User navigates to the /upload page
- User selects upload type: Daily or Weekly
- User selects date (for daily) or week range (for weekly)
- User uploads ZIP file with site folder structure
- System validates ZIP structure:
- Extract ZIP contents
- Verify first-level folder structure (site folders)
- Validate CSV files inside each site folder
- System processes each site folder:
- Extract site name from folder name (e.g., "Boddington")
- Parse all CSV files in the folder
- Validate data format and integrity
- Import measurements to database with site association
- System automatically generates PDF report (one per site)
- System notifies user of successful upload with summary
- If upload exists for this date/week/site → overwrite previous data and report
ZIP Package Structure Requirements:
your-data.zip
├── Boddington/ # Site folder (first level)
│ ├── device-001-20250115.csv
│ ├── device-002-20250115.csv
│ ├── device-003-20250115.csv
│ └── ...
├── SuperPit/ # Another site folder
│ ├── device-004-20250115.csv
│ └── device-005-20250115.csv
└── ...
Important Rules:
- First level must be site folders (no loose files)
- Folder names = Site names (e.g., "Boddington", "SuperPit", "Kalgoorlie")
- Each site folder contains CSV files (measurement data)
- Multiple CSV files per site are supported (different devices)
- ZIP file size limit: 200MB (increased for multiple sites/files)
CSV Format Requirements (Inside Site Folders):
loc,time,speed,massconcentration10p0,massconcentration2p5,massconcentration1p0,roadtemperature,batterytemperature,humidity,roadbumpscount,site,source_file
...
Note: The site column in the CSV should match the folder name, but the folder name takes precedence if they differ.
Validation Rules:
ZIP Structure Validation:
- First level must only contain folders (no files)
- Folder names must be valid (alphanumeric, spaces, hyphens allowed)
- Each folder must contain at least one CSV file
- No nested folders beyond first level
CSV File Validation:
- Required columns: loc, time, site, source_file
- At least one PM measurement column
- Date/time format: DD/MM/YYYY HH:MM:SS
- Numeric values for all measurement fields
- No empty CSV files
Data Consistency Validation:
- All CSV files in a site folder should be for the same date/week period
- Device identifiers (source_file) should be unique within a site
- Time ranges should not overlap for same device
Size Limits:
- Maximum ZIP file size: 200MB
- Maximum individual CSV file: 50MB
- Maximum number of sites per ZIP: 20
- Maximum CSV files per site: 50
Error Handling:
- Invalid ZIP structure → Show error with expected structure
- Missing CSV files in a folder → Skip that folder, warn user
- Invalid CSV format → Show which file(s) have errors
- Partial success → Import successful sites, report failed ones
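A minimal sketch of the first-level structure validation described above, assuming the jszip package is used in the browser; validateZipStructure is a hypothetical helper, not the project's actual validator:

```typescript
import JSZip from 'jszip';

export interface ZipValidationResult {
  valid: boolean;
  sites: string[];
  errors: string[];
}

// Hypothetical pre-check: first level must contain only site folders,
// each holding at least one CSV, with no deeper nesting.
export async function validateZipStructure(file: File): Promise<ZipValidationResult> {
  const zip = await JSZip.loadAsync(file);
  const errors: string[] = [];
  const csvCountBySite = new Map<string, number>();

  for (const entry of Object.values(zip.files)) {
    const parts = entry.name.split('/').filter(Boolean);
    const siteName = parts[0];

    if (!entry.dir && parts.length === 1) {
      errors.push(`Loose file at ZIP root: ${entry.name}`);
      continue;
    }
    if (parts.length > 2) {
      errors.push(`Nested folders beyond first level: ${entry.name}`);
      continue;
    }
    if (!/^[A-Za-z0-9 \-]+$/.test(siteName)) {
      errors.push(`Invalid site folder name: ${siteName}`);
      continue;
    }
    if (!csvCountBySite.has(siteName)) csvCountBySite.set(siteName, 0);
    if (!entry.dir && entry.name.toLowerCase().endsWith('.csv')) {
      csvCountBySite.set(siteName, csvCountBySite.get(siteName)! + 1);
    }
  }

  for (const [site, count] of csvCountBySite) {
    if (count === 0) errors.push(`Site folder "${site}" contains no CSV files`);
  }

  return { valid: errors.length === 0, sites: [...csvCountBySite.keys()], errors };
}
```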
Upload Progress Feedback:
Step 1: Uploading ZIP file... [████████████] 100%
Step 2: Extracting ZIP... [████████████] 100%
Step 3: Validating structure... [████████████] 100%
Step 4: Processing sites:
- Boddington: [████████████] 100% (3 CSV files, 1,245 records)
- SuperPit: [████████████] 100% (2 CSV files, 892 records)
Step 5: Generating reports:
- Boddington: [████████████] Complete
- SuperPit: [████████████] Complete
✅ Upload successful! Imported 2,137 records from 2 sites.
2.2 Feature: Report Management
User Story: As a user, I want to view and manage all my uploaded data reports organized by date.
Flow:
- User navigates to the /reports page
- System displays calendar/list view with:
- Daily reports (highlighted dates)
- Weekly reports (highlighted week ranges)
- User can:
- Filter by time period (month, quarter, year)
- Switch between daily/weekly view
- Search by site or device
- User clicks on a date/week
- System displays:
- Data summary statistics
- Download PDF button
- View PDF inline button
- Delete data button
- Re-generate report button
Management Actions:
- View Report: Open PDF in new tab or inline viewer
- Download Report: Download PDF to local machine
- Delete Data: Remove data + report (requires confirmation)
- Re-generate Report: Rebuild PDF with latest template
- Export Data: Download original CSV
2.3 Feature: PDF Report Generation
User Story: As a user, I want automated PDF reports with charts and statistics for each upload.
Report Structure:
┌────────────────────────────────────────┐
│ ENVIRONMENTAL MONITORING REPORT │
│ Site: [Site Name] │
│ Period: [Date or Week Range] │
│ Generated: [Timestamp] │
├────────────────────────────────────────┤
│ 1. EXECUTIVE SUMMARY │
│ - Total Records │
│ - Date Range │
│ - Devices Included │
│ - Key Findings │
├────────────────────────────────────────┤
│ 2. OVERVIEW STATISTICS │
│ - Average PM10, PM2.5, PM1.0 │
│ - Temperature Range │
│ - Humidity Range │
│ - Total Road Bumps │
├────────────────────────────────────────┤
│ 3. DATA VISUALIZATIONS │
│ Chart 1: PM10 Over Time (Line) │
│ Chart 2: PM Levels by Site (Bar) │
│ Chart 3: PM Distribution (Pie) │
│ Chart 4: Temp vs Humidity (Scatter)│
│ Chart 5: Daily Trends (Line) │
│ Chart 6: Hourly Averages (Heatmap) │
├────────────────────────────────────────┤
│ 4. DETAILED METRICS │
│ - Extended Statistics │
│ - Device Performance │
│ - Environmental Conditions │
├────────────────────────────────────────┤
│ 5. APPENDIX │
│ - Data Quality Notes │
│ - Methodology │
│ - Contact Information │
└────────────────────────────────────────┘
Technical Approach:
- Server-Side Rendering: Use Puppeteer to render React components
- Chart Library: Recharts (better SSR support than Nivo)
- Template: Reusable React component for report layout
- Styling: Tailwind CSS with print-optimized styles
- Page Size: A4 (210mm x 297mm)
- Format: PDF/A (archival standard)
3. Data Model (PostgreSQL with Supabase)
3.1 Supabase Auth Tables (Built-in)
Supabase automatically creates auth.users table. We don't modify this table directly. Instead, we create a user_profiles table to store additional user information.
3.2 Custom Database Tables
Table: user_profiles
-- User profile information (extends Supabase auth.users)
CREATE TABLE user_profiles (
id UUID PRIMARY KEY REFERENCES auth.users(id) ON DELETE CASCADE,
full_name TEXT NOT NULL,
organization TEXT,
role TEXT DEFAULT 'user' CHECK (role IN ('user', 'admin')),
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
-- Enable RLS
ALTER TABLE user_profiles ENABLE ROW LEVEL SECURITY;
-- RLS Policies
CREATE POLICY "Users can view own profile"
ON user_profiles FOR SELECT
USING (auth.uid() = id);
CREATE POLICY "Users can update own profile"
ON user_profiles FOR UPDATE
USING (auth.uid() = id);
-- Admins can view all profiles
CREATE POLICY "Admins can view all profiles"
ON user_profiles FOR SELECT
USING (
EXISTS (
SELECT 1 FROM user_profiles
WHERE id = auth.uid() AND role = 'admin'
)
);
-- Indexes
CREATE INDEX idx_user_profiles_role ON user_profiles(role);
-- Trigger to create profile on user signup
CREATE OR REPLACE FUNCTION public.handle_new_user()
RETURNS TRIGGER AS $$
BEGIN
INSERT INTO public.user_profiles (id, full_name, role)
VALUES (
NEW.id,
COALESCE(NEW.raw_user_meta_data->>'full_name', 'User'),
'user'
);
RETURN NEW;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;
CREATE TRIGGER on_auth_user_created
AFTER INSERT ON auth.users
FOR EACH ROW EXECUTE FUNCTION public.handle_new_user();
Table: uploads
CREATE TABLE uploads (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID NOT NULL REFERENCES auth.users(id) ON DELETE CASCADE,
upload_type TEXT NOT NULL CHECK (upload_type IN ('daily', 'weekly')),
period_start DATE NOT NULL,
period_end DATE NOT NULL,
-- ZIP file information
zip_filename TEXT NOT NULL,
zip_storage_path TEXT NOT NULL, -- Supabase Storage path to original ZIP
zip_size_bytes BIGINT NOT NULL,
-- Processing information
sites JSONB NOT NULL DEFAULT '[]'::jsonb, -- Array of site names extracted from ZIP
total_csv_files INTEGER NOT NULL DEFAULT 0,
total_record_count INTEGER NOT NULL DEFAULT 0,
-- Status tracking
upload_timestamp TIMESTAMPTZ DEFAULT NOW(),
processing_started_at TIMESTAMPTZ,
processing_completed_at TIMESTAMPTZ,
status TEXT DEFAULT 'uploaded' CHECK (status IN ('uploaded', 'extracting', 'validating', 'importing', 'generating_reports', 'completed', 'failed', 'partial_success')),
error_message TEXT,
processing_log JSONB, -- Detailed log of processing steps
-- Summary statistics
successful_sites INTEGER DEFAULT 0,
failed_sites INTEGER DEFAULT 0,
sites_summary JSONB -- Per-site summary: {site: {csv_count, record_count, status, error}}
);
-- Enable RLS
ALTER TABLE uploads ENABLE ROW LEVEL SECURITY;
-- RLS Policies
CREATE POLICY "Users can view own uploads"
ON uploads FOR SELECT
USING (auth.uid() = user_id);
CREATE POLICY "Users can create own uploads"
ON uploads FOR INSERT
WITH CHECK (auth.uid() = user_id);
CREATE POLICY "Users can update own uploads"
ON uploads FOR UPDATE
USING (auth.uid() = user_id);
CREATE POLICY "Users can delete own uploads"
ON uploads FOR DELETE
USING (auth.uid() = user_id);
-- Admin policies
CREATE POLICY "Admins can view all uploads"
ON uploads FOR ALL
USING (
EXISTS (
SELECT 1 FROM user_profiles
WHERE id = auth.uid() AND role = 'admin'
)
);
-- Indexes
CREATE INDEX idx_uploads_user ON uploads(user_id);
CREATE INDEX idx_uploads_period ON uploads(period_start, period_end);
CREATE INDEX idx_uploads_sites ON uploads USING GIN(sites); -- GIN index for JSONB array
CREATE INDEX idx_uploads_status ON uploads(status);
CREATE INDEX idx_uploads_type ON uploads(upload_type);
CREATE INDEX idx_uploads_timestamp ON uploads(upload_timestamp DESC);
Table: upload_sites (New - for better querying per site)
-- Normalized table for easier querying of individual sites within an upload
CREATE TABLE upload_sites (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
upload_id UUID NOT NULL REFERENCES uploads(id) ON DELETE CASCADE,
site_name TEXT NOT NULL,
csv_files JSONB NOT NULL DEFAULT '[]'::jsonb, -- Array of CSV filenames processed
csv_count INTEGER NOT NULL DEFAULT 0,
record_count INTEGER NOT NULL DEFAULT 0,
devices JSONB NOT NULL DEFAULT '[]'::jsonb, -- Array of device IDs (source_file values)
status TEXT DEFAULT 'pending' CHECK (status IN ('pending', 'processing', 'completed', 'failed')),
error_message TEXT,
processed_at TIMESTAMPTZ,
CONSTRAINT unique_upload_site UNIQUE(upload_id, site_name)
);
-- Enable RLS
ALTER TABLE upload_sites ENABLE ROW LEVEL SECURITY;
-- RLS Policies (inherited through uploads)
CREATE POLICY "Users can view own upload sites"
ON upload_sites FOR SELECT
USING (
EXISTS (
SELECT 1 FROM uploads
WHERE uploads.id = upload_sites.upload_id
AND uploads.user_id = auth.uid()
)
);
-- Admin policy
CREATE POLICY "Admins can view all upload sites"
ON upload_sites FOR SELECT
USING (
EXISTS (
SELECT 1 FROM user_profiles
WHERE id = auth.uid() AND role = 'admin'
)
);
-- Indexes
CREATE INDEX idx_upload_sites_upload ON upload_sites(upload_id);
CREATE INDEX idx_upload_sites_site ON upload_sites(site_name);
CREATE INDEX idx_upload_sites_status ON upload_sites(status);
Table: reports
CREATE TABLE reports (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
upload_id UUID NOT NULL REFERENCES uploads(id) ON DELETE CASCADE,
upload_site_id UUID NOT NULL REFERENCES upload_sites(id) ON DELETE CASCADE, -- Link to specific site
site_name TEXT NOT NULL, -- Denormalized for easier querying
pdf_filename TEXT NOT NULL,
pdf_storage_path TEXT NOT NULL, -- Supabase Storage path
generation_timestamp TIMESTAMPTZ DEFAULT NOW(),
generation_duration_ms INTEGER, -- Time taken to generate PDF
file_size_bytes BIGINT NOT NULL,
page_count INTEGER NOT NULL DEFAULT 0,
version INTEGER DEFAULT 1,
status TEXT DEFAULT 'pending' CHECK (status IN ('pending', 'generating', 'completed', 'failed')),
error_message TEXT
);
-- Enable RLS
ALTER TABLE reports ENABLE ROW LEVEL SECURITY;
-- RLS Policies - ALL USERS CAN VIEW ALL REPORTS (PUBLIC REPORTS)
CREATE POLICY "All authenticated users can view all reports"
ON reports FOR SELECT
USING (auth.uid() IS NOT NULL); -- Any logged-in user can view
-- Only owner can delete reports
CREATE POLICY "Users can delete own reports"
ON reports FOR DELETE
USING (
EXISTS (
SELECT 1 FROM uploads
WHERE uploads.id = reports.upload_id
AND uploads.user_id = auth.uid()
)
);
-- Only owner can update reports (for re-generation)
CREATE POLICY "Users can update own reports"
ON reports FOR UPDATE
USING (
EXISTS (
SELECT 1 FROM uploads
WHERE uploads.id = reports.upload_id
AND uploads.user_id = auth.uid()
)
);
-- Admin policy for full access
CREATE POLICY "Admins have full access to reports"
ON reports FOR ALL
USING (
EXISTS (
SELECT 1 FROM user_profiles
WHERE id = auth.uid() AND role = 'admin'
)
);
-- Indexes
CREATE INDEX idx_reports_upload ON reports(upload_id);
CREATE INDEX idx_reports_upload_site ON reports(upload_site_id);
CREATE INDEX idx_reports_site ON reports(site_name);
CREATE INDEX idx_reports_timestamp ON reports(generation_timestamp DESC);
CREATE INDEX idx_reports_status ON reports(status);
Table: measurements
CREATE TABLE measurements (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
upload_id UUID NOT NULL REFERENCES uploads(id) ON DELETE CASCADE,
upload_site_id UUID NOT NULL REFERENCES upload_sites(id) ON DELETE CASCADE, -- Link to specific site
csv_filename TEXT NOT NULL, -- Which CSV file this measurement came from
loc TEXT,
time TIMESTAMPTZ NOT NULL,
speed NUMERIC,
massconcentration1p0 NUMERIC,
massconcentration2p5 NUMERIC,
massconcentration4p0 NUMERIC,
massconcentration10p0 NUMERIC,
numberconcentrations0p5 NUMERIC,
numberconcentrations1p0 NUMERIC,
numberconcentrations2p5 NUMERIC,
numberconcentrations4p0 NUMERIC,
numberconcentrations10p0 NUMERIC,
typicalparticlesize NUMERIC,
roadtemperature NUMERIC,
batterytemperature NUMERIC,
humidity NUMERIC,
roadbumpscount INTEGER,
site TEXT NOT NULL, -- Denormalized from upload_sites for easier querying
source_file TEXT NOT NULL -- Device identifier (DustRanger device ID)
);
-- Enable RLS
ALTER TABLE measurements ENABLE ROW LEVEL SECURITY;
-- RLS Policies (inherited through uploads table)
CREATE POLICY "Users can view own measurements"
ON measurements FOR SELECT
USING (
EXISTS (
SELECT 1 FROM uploads
WHERE uploads.id = measurements.upload_id
AND uploads.user_id = auth.uid()
)
);
-- Admin policy
CREATE POLICY "Admins can view all measurements"
ON measurements FOR SELECT
USING (
EXISTS (
SELECT 1 FROM user_profiles
WHERE id = auth.uid() AND role = 'admin'
)
);
-- Indexes for performance
CREATE INDEX idx_measurements_upload ON measurements(upload_id);
CREATE INDEX idx_measurements_upload_site ON measurements(upload_site_id);
CREATE INDEX idx_measurements_time ON measurements(time);
CREATE INDEX idx_measurements_site ON measurements(site);
CREATE INDEX idx_measurements_site_time ON measurements(site, time);
CREATE INDEX idx_measurements_upload_time ON measurements(upload_id, time);
CREATE INDEX idx_measurements_source ON measurements(source_file);
Table: activity_logs (REQUIRED for audit trail)
CREATE TABLE activity_logs (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID REFERENCES auth.users(id) ON DELETE SET NULL,
action TEXT NOT NULL, -- 'create', 'read', 'update', 'delete', 'download', 'generate'
resource_type TEXT NOT NULL, -- 'upload', 'report', 'measurement', etc.
resource_id UUID,
resource_name TEXT, -- e.g., "Boddington Report 2025-01-15"
details JSONB, -- Additional context: {site, date, file_size, etc.}
ip_address INET,
user_agent TEXT,
created_at TIMESTAMPTZ DEFAULT NOW()
);
-- Enable RLS
ALTER TABLE activity_logs ENABLE ROW LEVEL SECURITY;
-- Users can view own logs
CREATE POLICY "Users can view own logs"
ON activity_logs FOR SELECT
USING (auth.uid() = user_id);
-- Admins can view all logs
CREATE POLICY "Admins can view all logs"
ON activity_logs FOR SELECT
USING (
EXISTS (
SELECT 1 FROM user_profiles
WHERE id = auth.uid() AND role = 'admin'
)
);
-- Indexes
CREATE INDEX idx_activity_logs_user ON activity_logs(user_id);
CREATE INDEX idx_activity_logs_created ON activity_logs(created_at DESC);
CREATE INDEX idx_activity_logs_action ON activity_logs(action);
CREATE INDEX idx_activity_logs_resource ON activity_logs(resource_type, resource_id);
-- Function to automatically log report CRUD operations
CREATE OR REPLACE FUNCTION log_report_activity()
RETURNS TRIGGER AS $$
DECLARE
action_name TEXT;
report_name TEXT;
BEGIN
-- Determine action based on trigger operation
CASE TG_OP
WHEN 'INSERT' THEN
action_name := 'create';
report_name := NEW.site_name || ' Report ' || TO_CHAR(NEW.generation_timestamp, 'YYYY-MM-DD');
INSERT INTO activity_logs (
user_id,
action,
resource_type,
resource_id,
resource_name,
details
) VALUES (
(SELECT user_id FROM uploads WHERE id = NEW.upload_id),
action_name,
'report',
NEW.id,
report_name,
jsonb_build_object(
'site', NEW.site_name,
'upload_id', NEW.upload_id,
'file_size_bytes', NEW.file_size_bytes,
'page_count', NEW.page_count
)
);
RETURN NEW;
WHEN 'UPDATE' THEN
action_name := 'update';
report_name := NEW.site_name || ' Report ' || TO_CHAR(NEW.generation_timestamp, 'YYYY-MM-DD');
INSERT INTO activity_logs (
user_id,
action,
resource_type,
resource_id,
resource_name,
details
) VALUES (
(SELECT user_id FROM uploads WHERE id = NEW.upload_id),
action_name,
'report',
NEW.id,
report_name,
jsonb_build_object(
'site', NEW.site_name,
'old_version', OLD.version,
'new_version', NEW.version,
'changes', jsonb_build_object(
'status_changed', OLD.status <> NEW.status,
'pdf_changed', OLD.pdf_storage_path <> NEW.pdf_storage_path
)
)
);
RETURN NEW;
WHEN 'DELETE' THEN
action_name := 'delete';
report_name := OLD.site_name || ' Report ' || TO_CHAR(OLD.generation_timestamp, 'YYYY-MM-DD');
INSERT INTO activity_logs (
user_id,
action,
resource_type,
resource_id,
resource_name,
details
) VALUES (
(SELECT user_id FROM uploads WHERE id = OLD.upload_id),
action_name,
'report',
OLD.id,
report_name,
jsonb_build_object(
'site', OLD.site_name,
'file_size_bytes', OLD.file_size_bytes,
'version', OLD.version
)
);
RETURN OLD;
END CASE;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;
-- Trigger for automatic logging
CREATE TRIGGER report_activity_logger
AFTER INSERT OR UPDATE OR DELETE ON reports
FOR EACH ROW EXECUTE FUNCTION log_report_activity();
-- Function to manually log report read/download actions
CREATE OR REPLACE FUNCTION log_report_read(
p_report_id UUID,
p_action TEXT -- 'read' or 'download'
)
RETURNS VOID AS $$
DECLARE
v_report reports%ROWTYPE;
v_user_id UUID;
v_report_name TEXT;
BEGIN
-- Get report details
SELECT * INTO v_report FROM reports WHERE id = p_report_id;
IF NOT FOUND THEN
RAISE EXCEPTION 'Report not found';
END IF;
-- Get current user
v_user_id := auth.uid();
-- Build report name
v_report_name := v_report.site_name || ' Report ' || TO_CHAR(v_report.generation_timestamp, 'YYYY-MM-DD');
-- Log the action
INSERT INTO activity_logs (
user_id,
action,
resource_type,
resource_id,
resource_name,
details
) VALUES (
v_user_id,
p_action,
'report',
p_report_id,
v_report_name,
jsonb_build_object(
'site', v_report.site_name,
'upload_id', v_report.upload_id,
'file_size_bytes', v_report.file_size_bytes
)
);
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;
Usage Example for Read/Download Logging:
// Frontend: After user views report
await supabase.rpc('log_report_read', {
p_report_id: reportId,
p_action: 'read'
});
// Frontend: After user downloads report
await supabase.rpc('log_report_read', {
p_report_id: reportId,
p_action: 'download'
});3.3 Supabase Storage Structure
Supabase Storage uses buckets (similar to S3 buckets) to organize files. We'll create three buckets:
Bucket: zip-uploads (Private)
zip-uploads/
├── {user_id}/
│ ├── daily/
│ │ ├── 2025-01/
│ │ │ ├── 2025-01-15_{upload_id}.zip # Original ZIP file
│ │ │ ├── 2025-01-16_{upload_id}.zip
│ │ │ └── ...
│ │ └── 2025-02/
│ │ └── ...
│ └── weekly/
│ ├── 2025-W01/
│ │ └── 2025-W01_{upload_id}.zip
│ └── 2025-W02/
│ └── ...
Bucket: extracted-csv (Private - Optional for debugging)
extracted-csv/
├── {user_id}/
│ └── {upload_id}/
│ ├── Boddington/ # Extracted site folders
│ │ ├── device-001-20250115.csv
│ │ ├── device-002-20250115.csv
│ │ └── ...
│ └── SuperPit/
│ ├── device-004-20250115.csv
│ └── ...
Note: Extracted CSVs can be stored temporarily for processing and then deleted, or kept for debugging/audit purposes.
Bucket: pdf-reports (Private)
pdf-reports/
├── {user_id}/
│ ├── daily/
│ │ ├── 2025-01/
│ │ │ ├── 2025-01-15_Boddington_v1_{report_id}.pdf
│ │ │ ├── 2025-01-15_SuperPit_v1_{report_id}.pdf # One PDF per site
│ │ │ ├── 2025-01-16_Boddington_v1_{report_id}.pdf
│ │ │ └── ...
│ │ └── 2025-02/
│ │ └── ...
│ └── weekly/
│ ├── 2025-W01/
│ │ ├── 2025-W01_Boddington_v1_{report_id}.pdf
│ │ ├── 2025-W01_SuperPit_v1_{report_id}.pdf
│ │ └── ...
│ └── 2025-W02/
│ └── ...
Key Changes:
- One ZIP file per upload contains all sites
- Multiple PDF reports per upload (one per site)
- Site name included in PDF filename for clarity
Storage Bucket Configuration:
// Create buckets in Supabase Dashboard or via SQL:
-- ZIP Uploads Bucket (Private)
INSERT INTO storage.buckets (id, name, public, file_size_limit, allowed_mime_types)
VALUES (
'zip-uploads',
'zip-uploads',
false,
209715200, -- 200MB
ARRAY['application/zip', 'application/x-zip-compressed']
);
-- Extracted CSV Bucket (Private, optional)
INSERT INTO storage.buckets (id, name, public, file_size_limit)
VALUES (
'extracted-csv',
'extracted-csv',
false,
52428800 -- 50MB per CSV
);
-- PDF Reports Bucket (Private with signed URLs)
INSERT INTO storage.buckets (id, name, public, file_size_limit, allowed_mime_types)
VALUES (
'pdf-reports',
'pdf-reports',
false,
20971520, -- 20MB
ARRAY['application/pdf']
);
-- Storage Policies for zip-uploads
CREATE POLICY "Users can upload own ZIP files"
ON storage.objects FOR INSERT
WITH CHECK (
bucket_id = 'zip-uploads' AND
auth.uid()::text = (storage.foldername(name))[1]
);
CREATE POLICY "Users can view own ZIP files"
ON storage.objects FOR SELECT
USING (
bucket_id = 'zip-uploads' AND
auth.uid()::text = (storage.foldername(name))[1]
);
CREATE POLICY "Users can delete own ZIP files"
ON storage.objects FOR DELETE
USING (
bucket_id = 'zip-uploads' AND
auth.uid()::text = (storage.foldername(name))[1]
);
-- Storage Policies for extracted-csv (backend service only)
CREATE POLICY "Service can manage extracted CSV files"
ON storage.objects FOR ALL
USING (bucket_id = 'extracted-csv');
-- Storage Policies for pdf-reports
CREATE POLICY "Users can view own PDF reports"
ON storage.objects FOR SELECT
USING (
bucket_id = 'pdf-reports' AND
auth.uid()::text = (storage.foldername(name))[1]
);
CREATE POLICY "Users can delete own PDF reports"
ON storage.objects FOR DELETE
USING (
bucket_id = 'pdf-reports' AND
auth.uid()::text = (storage.foldername(name))[1]
);
-- PDF Generator service can insert (using service role key)
CREATE POLICY "Service can upload PDF reports"
ON storage.objects FOR INSERT
WITH CHECK (bucket_id = 'pdf-reports');
-- Admin policies
CREATE POLICY "Admins can view all files"
ON storage.objects FOR SELECT
USING (
(bucket_id IN ('zip-uploads', 'extracted-csv', 'pdf-reports')) AND
EXISTS (
SELECT 1 FROM user_profiles
WHERE id = auth.uid() AND role = 'admin'
)
);
File Access Methods:
- Upload ZIP: supabase.storage.from('zip-uploads').upload(path, zipFile)
- Download ZIP: supabase.storage.from('zip-uploads').download(path)
- Download PDF: supabase.storage.from('pdf-reports').download(path)
- Get Signed URL: supabase.storage.from('pdf-reports').createSignedUrl(path, 3600) (1 hour)
- Delete: supabase.storage.from('zip-uploads').remove([path])
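A minimal sketch of these storage calls, assuming a configured supabase client and placeholder paths that follow the bucket layout above:

```typescript
import { supabase } from '../lib/supabaseClient'; // hypothetical client module

// Upload the original ZIP into the user's folder in the zip-uploads bucket
export async function uploadZip(userId: string, uploadId: string, zipFile: File) {
  const path = `${userId}/daily/2025-01/2025-01-15_${uploadId}.zip`; // placeholder path
  const { data, error } = await supabase.storage
    .from('zip-uploads')
    .upload(path, zipFile, { contentType: 'application/zip', upsert: true });
  if (error) throw error;
  return data.path;
}

// Create a temporary signed URL for a PDF report (valid for one hour)
export async function getReportDownloadUrl(pdfStoragePath: string) {
  const { data, error } = await supabase.storage
    .from('pdf-reports')
    .createSignedUrl(pdfStoragePath, 3600);
  if (error) throw error;
  return data.signedUrl;
}
```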
Processing Flow:
- User uploads ZIP → stored in the zip-uploads bucket
- Backend extracts ZIP → temporarily stores CSVs in extracted-csv (optional)
- Backend processes CSVs → inserts into database
- Backend generates PDFs → stores in the pdf-reports bucket
- Backend optionally deletes extracted CSVs (to save space)
Benefits:
- Type Safety: MIME type validation at storage level
- Size Limits: Enforced at bucket level
- Automatic CDN: Fast global delivery
- Access Control: RLS at storage level
- Audit Trail: All file operations logged
- Signed URLs: Temporary secure access for downloads
4. API Design
4.1 Upload Endpoints
POST /api/uploads
Request:
- Headers: Content-Type: multipart/form-data
- Body:
- file: File (CSV)
- uploadType: 'daily' | 'weekly'
- periodStart: string (ISO date)
- periodEnd: string (ISO date)
- site: string
Response:
{
success: boolean
uploadId: number
recordsImported: number
reportId: number
reportUrl: string
message: string
}
Error Codes:
- 400: Invalid file format or missing fields
- 409: Duplicate upload (will overwrite)
- 500: Server error
GET /api/uploads
Query Parameters:
- type?: 'daily' | 'weekly'
- site?: string
- startDate?: string
- endDate?: string
- limit?: number
- offset?: number
Response:
{
uploads: Upload[]
total: number
hasMore: boolean
}GET /api/uploads/:id
Response:
{
upload: Upload
report: Report
statistics: {
recordCount: number
dateRange: { start: string, end: string }
devices: string[]
avgPm10: number
// ... more stats
}
}
DELETE /api/uploads/:id
Response:
{
success: boolean
message: string
deletedRecords: number
}
4.2 Report Endpoints
POST /api/reports/generate/:uploadId
Response:
{
reportId: number
pdfUrl: string
generationTime: number (ms)
fileSize: number
pageCount: number
}
GET /api/reports/:id
Response: PDF file (application/pdf)
Headers:
- Content-Disposition: inline; filename="report.pdf"
- Content-Type: application/pdf
GET /api/reports/:id/download
Response: PDF file (application/pdf)
Headers:
- Content-Disposition: attachment; filename="report.pdf"
- Content-Type: application/pdf
GET /api/reports
Query Parameters:
- type?: 'daily' | 'weekly'
- site?: string
- startDate?: string
- endDate?: string
Response:
{
reports: Report[]
total: number
}
4.3 Dashboard Endpoints (Modified)
GET /api/dashboard/data
Query Parameters:
- uploadId?: number (filter by specific upload)
- site?: string
- startDate?: string
- endDate?: string
- metric?: string
- sampleRate?: number
Response:
{
overviewStats: {...}
timeSeriesData: [...]
aggregatedStats: [...]
// ... all existing dashboard data
}
5. Frontend Components
5.1 New Pages
1. Upload Page (/upload)
Components:
- UploadForm
- FileDropzone
- UploadTypeSelector (Daily/Weekly)
- DatePicker / WeekPicker
- SiteSelector
- ValidationDisplay
- UploadProgress
- UploadHistory (recent uploads)
2. Reports Page (/reports)
Components:
- ReportCalendar (calendar view with highlights)
- ReportList (list view with filters)
- ReportCard (summary + actions)
- ReportViewer (inline PDF viewer)
- FilterPanel
- Type Filter (Daily/Weekly)
- Site Filter
- Date Range Filter
- Search Input
3. Report Detail Page (/reports/:id)
Components:
- ReportHeader (metadata)
- ReportStatistics
- ReportActions
- View Button
- Download Button
- Delete Button
- Re-generate Button
- DataSourceInfo (original CSV info)
5.2 Modified Pages
Dashboard (/dashboard)
Changes:
- Add "Upload Data" button → links to /upload
- Add "View Reports" button → links to /reports
- Add upload filter dropdown (view by upload ID)
- Add indicator showing current data source
Navigation
New Menu Items:
- Upload Data
- My Reports
- Dashboard
- Settings (new)
5.3 Shared Components
1. PDFViewer
// Wrapper around react-pdf or iframe
<PDFViewer
url={pdfUrl}
mode="inline" | "fullscreen"
enableDownload={true}
enablePrint={true}
/>
2. DataUploadDropzone
<DataUploadDropzone
accept=".csv"
maxSize={50 * 1024 * 1024} // 50MB
onDrop={handleFileDrop}
onValidate={validateCSV}
/>
3. PeriodSelector
<PeriodSelector
type="daily" | "weekly"
value={period}
onChange={setPeriod}
/>
6. PDF Generation Architecture
6.1 Generation Pipeline
User Upload → Data Storage → Report Queue → Generator → PDF Storage
Step-by-Step Process:
- Trigger: Upload completes or user clicks "Re-generate"
- Queue: Add job to generation queue (for async processing)
- Data Fetch: Query database for upload's measurements
- Statistics: Calculate all statistics and metrics
- Chart Data: Transform data for all visualizations
- Render: Launch Puppeteer browser instance
- HTML Generation: Render React report template with data
- PDF Conversion: Puppeteer prints page to PDF
- Storage: Save PDF to filesystem
- Database Update: Create report record
- Cleanup: Close browser, remove old version if exists
- Notification: Return PDF URL to user
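One possible shape for the queue step above, sketched with the p-queue package and a hypothetical generateReport function; this is an assumption, not the service's actual queue implementation:

```typescript
import PQueue from 'p-queue';
import { generateReport } from './pdfGenerator'; // hypothetical module path

// Limit concurrent PDF generations (see Section 9.2) and time each job out at 2 minutes.
const reportQueue = new PQueue({ concurrency: 5, timeout: 120_000, throwOnTimeout: true });

// Hypothetical entry point called when an upload finishes or on "Re-generate"
export function enqueueReportGeneration(uploadSiteId: string) {
  return reportQueue.add(async () => {
    const report = await generateReport(uploadSiteId);
    return report;
  });
}
```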
6.2 Report Template Component
// src/components/reports/ReportTemplate.tsx
interface ReportTemplateProps {
data: {
upload: Upload
statistics: Statistics
charts: {
timeSeriesData: TimeSeriesData[]
pmDistribution: PmDistributionData[]
temperatureHumidity: ScatterData[]
// ... more chart data
}
}
}
export function ReportTemplate({ data }: ReportTemplateProps) {
return (
<div className="report-container">
<ReportHeader upload={data.upload} />
<ExecutiveSummary stats={data.statistics} />
<OverviewStatistics stats={data.statistics} />
<DataVisualizations charts={data.charts} />
<DetailedMetrics stats={data.statistics} />
<Appendix upload={data.upload} />
</div>
)
}
6.3 PDF Generation Service
// server/services/pdfGenerator.ts
import puppeteer from 'puppeteer'
// db, calculateStatistics, transformChartsData, renderReportHTML, savePDF and
// getPDFPageCount are existing project services/helpers assumed to be imported here
class PDFGenerator {
async generateReport(uploadId: number): Promise<Report> {
// 1. Fetch data
const upload = await db.getUpload(uploadId)
const measurements = await db.getMeasurements(uploadId)
// 2. Calculate statistics
const statistics = calculateStatistics(measurements)
// 3. Transform chart data
const charts = transformChartsData(measurements)
// 4. Render HTML
const html = await renderReportHTML({ upload, statistics, charts })
// 5. Generate PDF
const pdfBuffer = await this.htmlToPDF(html)
// 6. Save PDF
const pdfPath = await savePDF(pdfBuffer, upload)
// 7. Create report record
const report = await db.createReport({
uploadId,
pdfPath,
fileSize: pdfBuffer.length,
pageCount: await getPDFPageCount(pdfBuffer)
})
return report
}
private async htmlToPDF(html: string): Promise<Buffer> {
const browser = await puppeteer.launch({ headless: true })
const page = await browser.newPage()
await page.setContent(html, { waitUntil: 'networkidle0' })
const pdf = await page.pdf({
format: 'A4',
printBackground: true,
margin: { top: '20mm', bottom: '20mm', left: '15mm', right: '15mm' }
})
await browser.close()
return Buffer.from(pdf)
}
}
7. Data Management Features
7.1 Calendar View
Implementation:
// Monthly calendar with highlighted dates
<Calendar
mode="month"
highlightedDates={uploadsMap} // { '2025-01-15': [upload1, upload2] }
onDateClick={(date) => showUploadsForDate(date)}
renderDay={(date, uploads) => (
<DayCell date={date}>
{uploads.length > 0 && (
<Badge count={uploads.length} />
)}
</DayCell>
)}
/>
Weekly View:
// Week-based calendar
<WeekCalendar
highlightedWeeks={weekUploads} // { '2025-W01': [upload1] }
onWeekClick={(weekNumber) => showUploadsForWeek(weekNumber)}
/>
7.2 Bulk Operations
Features:
- Select multiple reports
- Bulk download (zip file)
- Bulk delete (with confirmation)
- Bulk re-generate
7.3 Search & Filter
Filter Options:
- Upload type (daily/weekly)
- Date range
- Site
- Device
- Status (completed/failed)
- Report availability
Search:
- Full-text search on site name, device ID
- Date search (natural language: "last week", "January 2025")
8. Error Handling & Validation
8.1 CSV Validation
Pre-Upload Validation:
interface ValidationResult {
isValid: boolean
errors: ValidationError[]
warnings: ValidationWarning[]
preview: MeasurementRecord[]
}
async function validateCSV(file: File): Promise<ValidationResult> {
// 1. Check file extension
// 2. Check file size
// 3. Parse first 10 rows
// 4. Validate headers
// 5. Validate data types
// 6. Check for required columns
// 7. Detect date format
// 8. Check for duplicates
return result
}Validation Rules:
- Required columns present
- No empty rows
- Valid date formats
- Numeric values in range
- No future dates
- Consistent site/device naming
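A minimal sketch of the header and preview checks, assuming the papaparse library; the required-column list mirrors the CSV rules in Section 2.1:

```typescript
import Papa from 'papaparse';

const REQUIRED_COLUMNS = ['loc', 'time', 'site', 'source_file'];
const PM_COLUMNS = ['massconcentration1p0', 'massconcentration2p5', 'massconcentration10p0'];

// Parse only a small preview of the file and apply the header rules above.
export function validateCsvHeaders(file: File): Promise<string[]> {
  return new Promise((resolve) => {
    Papa.parse<Record<string, string>>(file, {
      header: true,
      preview: 10,
      skipEmptyLines: true,
      complete: (results) => {
        const errors: string[] = [];
        const fields = results.meta.fields ?? [];
        for (const col of REQUIRED_COLUMNS) {
          if (!fields.includes(col)) errors.push(`Missing required column: ${col}`);
        }
        if (!PM_COLUMNS.some((col) => fields.includes(col))) {
          errors.push('At least one PM measurement column is required');
        }
        if (results.data.length === 0) errors.push('CSV file contains no data rows');
        resolve(errors);
      },
    });
  });
}
```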
8.2 Upload Error Handling
Error Types:
- File Format Error: Invalid CSV structure
- Data Validation Error: Invalid values or missing data
- Duplicate Error: Data already exists for this period
- Storage Error: Disk space or permissions issue
- Database Error: Insert/update failures
- PDF Generation Error: Rendering failures
User Feedback:
<UploadErrorDisplay
error={error}
onRetry={() => retryUpload()}
onEditData={() => showDataEditor()}
onContactSupport={() => openSupportDialog()}
/>
8.3 Report Generation Error Handling
Retry Logic:
- 3 automatic retries with exponential backoff
- Store failed generation attempts in database
- User notification for persistent failures
- Manual retry option in UI
Fallback Strategies:
- If chart rendering fails → use text statistics
- If full report fails → generate summary PDF
- Store error logs for debugging
9. Performance Considerations
9.1 Upload Performance
Optimizations:
- Stream CSV parsing (avoid loading entire file in memory)
- Batch database inserts (1000 rows at a time)
- Background processing for large files
- Progress feedback to user
- Compression for stored CSVs
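A sketch of the batched insert step (1000 rows at a time), assuming a configured supabase client and a hypothetical row type that mirrors the measurements table:

```typescript
import { supabase } from '../lib/supabaseClient'; // hypothetical client module

// Hypothetical row shape; the real type covers all measurement columns
interface MeasurementRow {
  upload_id: string;
  upload_site_id: string;
  csv_filename: string;
  time: string;
  site: string;
  source_file: string;
  massconcentration10p0?: number;
}

const BATCH_SIZE = 1000;

// Insert rows in fixed-size batches to keep each request small.
export async function insertMeasurements(rows: MeasurementRow[]) {
  for (let i = 0; i < rows.length; i += BATCH_SIZE) {
    const batch = rows.slice(i, i + BATCH_SIZE);
    const { error } = await supabase.from('measurements').insert(batch);
    if (error) throw new Error(`Batch ${i / BATCH_SIZE + 1} failed: ${error.message}`);
  }
}
```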
9.2 PDF Generation Performance
Optimizations:
- Queue-based generation (avoid blocking uploads)
- Puppeteer instance pooling
- Cache rendered charts
- Lazy loading of report data
- Pre-generate common reports
Scalability:
- Maximum 5 concurrent PDF generations
- Queue system for peak loads
- Timeout: 2 minutes per report
- Resource monitoring and limits
9.3 Database Performance
Indexes:
CREATE INDEX idx_measurements_upload_time ON measurements(upload_id, time);
CREATE INDEX idx_measurements_site_time ON measurements(site, time);
CREATE INDEX idx_uploads_period ON uploads(period_start, period_end);
CREATE INDEX idx_reports_upload ON reports(upload_id);
Query Optimization:
- Use prepared statements
- Limit result sets with pagination
- Aggregate queries with proper indexing
- Cache frequently accessed statistics
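A sketch of paginated querying with the Supabase client against the uploads table above (hypothetical helper name):

```typescript
import { supabase } from '../lib/supabaseClient'; // hypothetical client module

// Fetch one page of uploads, newest first, with a total count for pagination UI.
export async function fetchUploadsPage(page: number, pageSize = 20) {
  const from = page * pageSize;
  const to = from + pageSize - 1;
  const { data, count, error } = await supabase
    .from('uploads')
    .select('*', { count: 'exact' })
    .order('upload_timestamp', { ascending: false })
    .range(from, to);
  if (error) throw error;
  return { uploads: data ?? [], total: count ?? 0, hasMore: to + 1 < (count ?? 0) };
}
```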
10. Security Considerations
10.1 Upload Security
File Validation:
- Whitelist: Only .csv files
- Size limit: 50MB maximum
- Virus scanning (optional: ClamAV)
- Content validation (no executable content)
Input Sanitization:
- Escape all user inputs
- Validate date ranges
- Sanitize filenames (no path traversal)
- Limit upload frequency (rate limiting)
10.2 API Security
Authentication:
- JWT-based authentication (future)
- API key for programmatic access
- Session management
Authorization:
- Users can only access their own data
- Role-based access (admin/user)
- Audit logging for sensitive operations
Protection:
- CORS configuration
- Rate limiting (express-rate-limit)
- Request size limits
- SQL injection prevention (parameterized queries)
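A sketch of the rate-limiting and request-size protections for the PDF generation service, assuming Express with the express-rate-limit middleware:

```typescript
import express from 'express';
import rateLimit from 'express-rate-limit';

const app = express();

// Cap request body size and request frequency per IP.
app.use(express.json({ limit: '1mb' }));
app.use(
  rateLimit({
    windowMs: 15 * 60 * 1000, // 15 minutes
    max: 100,                 // limit each IP to 100 requests per window
    standardHeaders: true,
    legacyHeaders: false,
  })
);
```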
10.3 File Storage Security
Access Control:
- Files stored outside web root
- No direct file access via URL
- Serve files through API with auth checks
- Temporary signed URLs for downloads
11. User Experience Enhancements
11.1 Real-Time Feedback
Upload Progress:
<UploadProgress
stage="parsing" | "validating" | "importing" | "generating"
progress={75}
currentOperation="Importing row 7500/10000"
/>
Notifications:
- Upload started
- Upload completed
- Report ready
- Errors occurred
11.2 Data Preview
Before Upload:
- Show first 10 rows of CSV
- Display detected site/device
- Show date range
- Highlight any warnings
After Upload:
- Quick statistics
- Link to full report
- Link to dashboard filtered to this data
11.3 Report Preview
Thumbnail Generation:
- Generate PNG thumbnail of first page
- Display in report list
- Quick visual identification
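One way to produce the thumbnail, sketched here by screenshotting the rendered report HTML with Puppeteer before PDF conversion; this is an assumption, the document does not prescribe a method:

```typescript
import puppeteer from 'puppeteer';

// Render the report HTML and capture an A4-proportioned PNG of the first page.
export async function generateThumbnail(reportHtml: string): Promise<Buffer> {
  const browser = await puppeteer.launch({ headless: true });
  try {
    const page = await browser.newPage();
    // Roughly A4 at 96 DPI; only the visible viewport (first page) is captured.
    await page.setViewport({ width: 794, height: 1123 });
    await page.setContent(reportHtml, { waitUntil: 'networkidle0' });
    const png = await page.screenshot({ type: 'png' });
    return Buffer.from(png);
  } finally {
    await browser.close();
  }
}
```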
12. Future Enhancements
12.1 Phase 2 Features
- Email Reports: Send PDF via email on generation
- Scheduled Reports: Generate reports automatically
- Comparison Reports: Compare multiple periods
- Custom Templates: User-defined report layouts
- Export Formats: Excel, JSON, XML exports
- API Access: RESTful API for external integrations
12.2 Phase 3 Features
- User Authentication: Multi-user support
- Team Collaboration: Share reports and data
- Advanced Analytics: ML-based insights
- Mobile App: iOS/Android clients
- Cloud Storage: S3/GCS integration
- Real-Time Data: WebSocket for live updates
13. Testing Strategy
13.1 Unit Tests
Coverage Areas:
- CSV parsing and validation
- Data transformations
- Statistics calculations
- Chart data transformers
- API endpoint logic
Tools: Vitest + Testing Library
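A sketch of a statistics unit test, assuming a hypothetical calculateStatistics helper and import path in the dashboard feature:

```typescript
import { describe, it, expect } from 'vitest';
import { calculateStatistics } from '@/features/dashboard/utils/statistics'; // hypothetical path

describe('calculateStatistics', () => {
  it('averages PM10 readings across measurements', () => {
    const measurements = [
      { massconcentration10p0: 10, site: 'Boddington' },
      { massconcentration10p0: 30, site: 'Boddington' },
    ];
    const stats = calculateStatistics(measurements);
    expect(stats.avgPm10).toBe(20);
  });
});
```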
13.2 Integration Tests
Scenarios:
- Complete upload flow
- PDF generation pipeline
- Report retrieval and download
- Data deletion cascade
Tools: Vitest + Supertest
13.3 E2E Tests
User Flows:
- Upload daily data → view report
- Upload weekly data → view report
- Browse reports calendar → download
- Delete upload → verify cleanup
- Re-generate report → verify update
Tools: Playwright
14. Deployment Architecture
14.1 Development Environment
Docker Compose:
- Node.js API server (port 3000)
- Vite dev server (port 5173)
- SQLite database (volume mount)
- File storage (volume mount)
14.2 Production Environment
Option 1: Single Server
- Node.js + Express (API + static serving)
- SQLite database
- Local file storage
- Nginx reverse proxy
Option 2: Containerized
- Docker container for API
- Volume mounts for database + files
- Horizontal scaling with shared storage
15. Success Metrics
15.1 Performance Metrics
- Upload processing time: < 30 seconds for 10k rows
- PDF generation time: < 60 seconds
- Report load time: < 2 seconds
- Dashboard query time: < 1 second
15.2 User Metrics
- Upload success rate: > 95%
- Report generation success rate: > 98%
- User satisfaction score: > 4.5/5
- Daily active users: tracking
- Average uploads per user: tracking
16. Extended Features (Phase 5+)
16.1 Dust Levels Module
Location: /src/features/dust-levels/
Capabilities:
- Real-time dust level data visualization from external scraper (Dustac)
- Multiple chart types:
- Dust Level & Temperature correlation charts
- Daily/Weekly/Monthly average dust levels
- Spike dust analysis (work area and site-wide)
- Dustloc water usage tracking
- Multi-site comparison charts
- Dust distribution analysis with statistical insights
- PDF export functionality with:
- Portrait/Landscape orientation toggle
- Multiple chart selection for export
- Professional report layout with logos
- AI-generated chart descriptions
- Data aggregation services for daily/weekly/monthly summaries
- Compare mode for multi-site data analysis
Key Services:
- dataAggregationService.ts - Aggregates measurements into summaries
- databaseService.ts - Queries dust level data
- exportService.ts - Generates PDF reports with charts
- compareExportService.ts - Multi-site comparison exports
- scraperService.ts - Triggers the external dust level scraper
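A sketch of the daily aggregation step in dataAggregationService.ts, assuming a hypothetical record shape; the real service may differ:

```typescript
interface DustReading {
  time: string;       // ISO timestamp
  site: string;
  dustLevel: number;  // hypothetical field name
}

interface DailySummary {
  date: string;
  site: string;
  avgDustLevel: number;
  readingCount: number;
}

// Group readings by calendar day and site, then average the dust level.
export function aggregateDaily(readings: DustReading[]): DailySummary[] {
  const groups = new Map<string, { sum: number; count: number; date: string; site: string }>();
  for (const r of readings) {
    const date = r.time.slice(0, 10); // YYYY-MM-DD
    const key = `${r.site}|${date}`;
    const group = groups.get(key) ?? { sum: 0, count: 0, date, site: r.site };
    group.sum += r.dustLevel;
    group.count += 1;
    groups.set(key, group);
  }
  return [...groups.values()].map((g) => ({
    date: g.date,
    site: g.site,
    avgDustLevel: g.sum / g.count,
    readingCount: g.count,
  }));
}
```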
Routes:
- /dust-levels - Main dust levels dashboard
- /dust-levels/compare - Multi-site comparison view
16.2 Weekly Reports Module
Location: /src/features/weekly-reports/
Capabilities:
- Structured weekly field report creation with auto-save
- 8 report sections:
- Site Observations - Device status, issues, actions taken
- Flow Meter Usage - Water consumption tracking
- Dashboard Updates - Platform changes and issues
- Vendor Activities - Connect Source and vendor work logs
- Water Truck Testing - Hardware testing summaries
- Hardware & Installations - Equipment tracking
- Admin & Reporting - Travel, logistics, record-keeping
- Other Tasks - Site support, stakeholder engagement
- Draft/Completed status management
- Auto-save with visual indicators
- Site name management
- Nested form validation with React Hook Form + Zod
Key Service:
- weeklyReportService.ts - CRUD operations with auto-save, data validation
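A sketch of the auto-save behaviour, assuming a simple debounce and a configured supabase client; the function names here are hypothetical, not the service's actual API:

```typescript
import { supabase } from '../lib/supabaseClient'; // hypothetical client module

// Debounce saves so rapid form edits produce a single write.
export function createAutoSaver(saveFn: (draft: unknown) => Promise<void>, delayMs = 2000) {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (draft: unknown) => {
    if (timer) clearTimeout(timer);
    timer = setTimeout(() => {
      void saveFn(draft);
    }, delayMs);
  };
}

// Hypothetical save function writing the draft to the weekly_reports table
async function saveDraft(draft: unknown) {
  const { error } = await supabase
    .from('weekly_reports')
    .upsert(draft as Record<string, unknown>);
  if (error) throw error;
}

// Usage: call autoSave(formValues) from a React Hook Form watch/onChange handler
export const autoSave = createAutoSaver(saveDraft);
```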
Database Table:
CREATE TABLE weekly_reports (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID NOT NULL REFERENCES auth.users(id),
week_start DATE NOT NULL,
week_end DATE NOT NULL,
status TEXT DEFAULT 'draft',
site_observations JSONB,
flow_meter_usage JSONB,
dashboard_updates JSONB,
vendor_activities JSONB,
water_truck_testing JSONB,
hardware_installations JSONB,
admin_reporting JSONB,
other_tasks JSONB,
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
16.3 Report Templates Module
Location: /src/features/report-templates/
Capabilities:
- Create custom markdown-based report templates
- Set templates as default for quick access
- Template management (create, edit, delete)
- Pre-populated default template for mine site reporting
- Mine-sites management integration
Routes:
- /report-template - List and manage templates
- /report-template/create - Create new template
- /report-template/edit/:id - Edit existing template
- /report-template/mine-sites - Manage mine site list
Database Table:
CREATE TABLE report_templates (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID NOT NULL REFERENCES auth.users(id),
name TEXT NOT NULL,
content TEXT NOT NULL,
is_default BOOLEAN DEFAULT false,
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
16.4 Flow Meter Module
Location: /src/features/flow-meter/
Capabilities:
- Water usage tracking across multiple sites
- Refill management with add/edit/delete functionality
- Advanced scraper configuration with presets and custom date ranges
- Site auto-detection from actual data
- Data aggregation with daily summaries
Scraper Configuration Options:
- maxPages (1-100+)
- delayBetweenRequests (0-10 seconds)
- delayBetweenSites (0-30 seconds)
- Time range presets: AllTime, Today, Yesterday, ThisWeek, LastWeek, ThisMonth, LastMonth, ThisQuarter, etc.
Key Components:
- ScraperConfigModal.tsx - Configurable scraper parameters
- DateRangeSelector.tsx - Date range selection with presets
- AddRefillModal.tsx / EditRefillModal.tsx - Refill management
- SiteSummaryCard.tsx - Site statistics display
16.5 Climate Module (Framework)
Location: /src/features/climate/
Status: Framework in place, pending full implementation
Planned Capabilities:
- Weather station data integration
- Temperature, rainfall, humidity tracking
- Wind speed and pressure monitoring
- Historical weather data analysis
Database Tables:
CREATE TABLE weather_stations (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID NOT NULL REFERENCES auth.users(id),
station_name TEXT NOT NULL,
location TEXT,
latitude NUMERIC,
longitude NUMERIC,
created_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE TABLE weather_data (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
station_id UUID REFERENCES weather_stations(id),
user_id UUID NOT NULL REFERENCES auth.users(id),
timestamp TIMESTAMPTZ NOT NULL,
temperature NUMERIC,
rainfall NUMERIC,
humidity NUMERIC,
wind_speed NUMERIC,
wind_direction TEXT,
pressure NUMERIC
);
Routes:
- /climate - Main climate dashboard
- /climate/stations - Weather station management
17. Edge Functions (Supabase)
17.1 AI Chart Descriptions
Function: generate-chart-descriptions
Purpose: Generate AI-powered descriptions for dust level charts using Google Gemini API.
Supported Chart Types:
- Dust Levels with Temperature
- Daily/Weekly/Monthly Average Dust Levels
- Spike Dust Data (work area and site-wide)
- Dustloc Usage
- Multi-site comparison charts
- Comparison tables
Integration:
// Called from ExportModal.tsx during PDF generation
const response = await supabase.functions.invoke('generate-chart-descriptions', {
body: {
chartType: 'dustLevelsTemp',
chartData: [...],
siteName: 'Boddington',
dateRange: { start: '2025-01-01', end: '2025-01-31' }
}
});
Configuration:
- Model: Gemini Pro
- Temperature: 0.7
- Max tokens: 500
- Professional mining industry terminology
17.2 Dust Level Scraper Trigger
Function: trigger-dust-level-scraper
Purpose: Trigger external Dustac API to fetch latest dust level data.
External API: http://207.211.150.159:8000/api/external/pipelines/dust_level_scraper/run
Parameters:
- Time range selection (preset or custom dates)
- JWT token authentication
Usage:
await supabase.functions.invoke('trigger-dust-level-scraper', {
body: {
timeRange: 'last_week',
// or custom: { startDate: '2025-01-01', endDate: '2025-01-31' }
}
});
17.3 Flow Meter Scraper Trigger
Function: trigger-flow-meter-scraper
Purpose: Trigger external API to fetch flow meter data with configurable parameters.
Parameters:
- maxPages - Number of pages to scrape (1-100+)
- delayBetweenRequests - Delay in seconds (0-10)
- delayBetweenSites - Delay between sites (0-30)
- timeRange - Preset or custom date range
Time Range Presets:
- AllTime, Today, Yesterday
- ThisWeek, LastWeek
- ThisMonth, LastMonth
- ThisQuarter, LastQuarter
- ThisYear, LastYear
18. Database Schema Additions (Phase 5+)
New Tables Summary
| Table | Purpose |
|---|---|
| weekly_reports | Structured weekly field reports |
| report_templates | User report templates |
| weather_stations | Climate data sources |
| weather_data | Historical weather measurements |
| flow_meter_data | Water usage tracking |
| flow_meter_refills | Refill records |
All tables include:
- UUID primary keys
- User ID foreign keys with RLS policies
- Created/Updated timestamps
- Appropriate indexes for query performance
Summary
This system design transforms the current dashboard into a comprehensive data management platform with:
✅ Upload System: Daily/weekly CSV uploads with validation
✅ Report Generation: Automated PDF reports with charts
✅ Data Management: Calendar view, search, filter, delete
✅ Overwrite Logic: Automatic replacement of existing reports
✅ User Interface: Intuitive upload and browsing experience
✅ Performance: Optimized for large datasets and concurrent users
✅ Security: Input validation, file protection, API security
✅ Scalability: Queue-based processing, modular architecture
Extended Features (Phase 5+):
✅ Dust Levels: Real-time monitoring with multi-chart visualization and PDF export
✅ Weekly Reports: Structured field reporting with auto-save functionality
✅ Flow Meter: Water usage tracking with scraper integration
✅ Report Templates: Customizable markdown-based templates
✅ AI Integration: Gemini-powered chart descriptions for reports
✅ External Scrapers: Automated data collection from Dustac and flow meter systems
✅ Climate Framework: Weather station integration (pending full implementation)
✅ Mine Sites: Location and asset management
Next Steps: See IMPLEMENTATION_PLAN.md for phased development approach.