8 The OpenWebUI Action MVP

8.1 Overview

Our v1 MVP is implemented as an OpenWebUI Action that enables opt-in data contribution directly from the chat interface to a HuggingFace dataset repository. This approach leverages OpenWebUI’s existing infrastructure and user accounts, eliminating the need for a separate flywheel website.

The goal is to obtain the benefits of using a source control backend (in this case, git on HuggingFace) but try to mitigate the additional friction / UX challenges associated with using git or other “high-effort” source control approaches.

8.2 Architecture

The flywheel consists of:

Frontend: OpenWebUI instance at https://chat.publicai.co/
Action Plugin: Python-based OpenWebUI action that handles contributions
User Settings: OpenWebUI’s “User Valves” for persistent preferences
Data Storage: Private HuggingFace dataset repository for staging / quaratine. Public but user agreement gated HuggingFace dataset repository for approved data.
Processing Pipeline: Asynchronous scripts that process the waiting room
Static site: Static site with anti-scraping for direct download.

8.3 How Contribution Works

8.3.1 User Setup (One-time)

User creates an OpenWebUI account
User opens Controls / Valves / Functions
User toggles Sharing Enabled to ON
User selects:
- License: CC0-1.0, CC-BY-4.0, or CC-BY-SA-4.0
- AI Preference Signal: IETF/CC preference (e.g., train-genai=n;exceptions=cc-cr)
- Pseudonym: Use username, anonymous, or custom name
- Auto-feedback: Whether to skip feedback prompts

8.3.2 Contributing a Chat

User has a conversation with any model
User triggers the “Public AI Data Flywheel” action
System shows current settings and asks for confirmation
Optional: User provides more feedback or context
Action creates a contribution JSON with:
- Conversation messages
- Metadata (model, tokens, timestamp)
- User’s license and AI preference selections
- Attribution (based on pseudonym setting)
- Contributor hash (anonymized ID)
Contribution uploads to _waiting_room/ in HuggingFace repo

8.3.3 Data Processing Pipeline

Waiting Room: Contributions land in _waiting_room/ directory
Validation: Daily script processes pending files
PII Redaction: Automated check for emails, SSNs, phone numbers, etc.
Quarantine: Files with PII hits or errors go to _quarantined/
Release: Clean contributions move to “ready” directory
Distribution: Published via gated HF repo and public gallery

8.4 Key Features

8.4.1 Privacy & Attribution

Contributor Hash: SHA256 hash of salt + "openwebui:" + user_id (16 chars)
Pseudonymity: Users choose between username, anonymous, or custom pseudonym
Avoid unintended PII in final dataset: Some automated redaction before release

8.4.2 User Control

Opt-in only: Requires explicit enabling in settings
Per-contribution consent: Confirmation before each share
Persistent preferences: Settings saved in User Valves
License selection: Per-user default (not per-contribution in v1)

8.4.3 Safety Features

Mock mode: Test contributions without actual upload
PII detection: Email, IP, SSN, IBAN, crypto wallets, phone, credit cards
Quarantine system: Content can be held for review
Rate limiting: Handled by OpenWebUI’s existing limits