Find Alpha Where the Market Breaks

If you’re building ML models for algorithmic trading, you already know the problem:

High-frequency market data is noisy, expensive to clean, and painfully slow to label.

The MagSeven High-Freq Anomaly Dataset skips all of that.

Instead of raw OHLCV files, you get a precision-engineered collection of real market stress events: moments where volatility and liquidity spike together, and models actually learn something meaningful.

We scan minute-level market data (1m and 5m aggregation) for the Magnificent Seven tech stocks and extract only statistically significant volatility + volume shock events, ready for immediate use.
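To make the dual-factor idea concrete, here is a minimal sketch of one way such a scan could work. It is an illustration, not the dataset's actual detection logic: the rolling z-score approach, the function names `zscore` and `detect_shocks`, the 20-bar lookback, and the threshold of 3.0 are all assumptions chosen for the example.

```python
from statistics import mean, stdev

def zscore(window, value):
    """Z-score of value against a trailing window; 0.0 if the window has no spread."""
    sigma = stdev(window)
    return (value - mean(window)) / sigma if sigma > 0 else 0.0

def detect_shocks(closes, volumes, lookback=20, threshold=3.0):
    """Return bar indices where absolute return AND volume both exceed the threshold.

    Assumed method: each bar is compared with the preceding `lookback` bars,
    and an event needs both z-scores at or above `threshold`.
    """
    returns = [0.0] + [closes[i] / closes[i - 1] - 1.0 for i in range(1, len(closes))]
    abs_returns = [abs(r) for r in returns]
    events = []
    # Start at lookback + 1 so the window never includes the placeholder return at index 0.
    for i in range(lookback + 1, len(closes)):
        volatility_z = zscore(abs_returns[i - lookback:i], abs_returns[i])
        volume_z = zscore(volumes[i - lookback:i], volumes[i])
        if volatility_z >= threshold and volume_z >= threshold:
            events.append(i)
    return events

# Synthetic bars (not real market data): quiet trading, then a price jump
# with a volume spike at bar 30.
closes = [100.0 + 0.05 * (i % 4) + (3.0 if i >= 30 else 0.0) for i in range(40)]
volumes = [1000.0 + 10.0 * (i % 5) for i in range(40)]
volumes[30] = 5000.0

shock_bars = detect_shocks(closes, volumes)
print(shock_bars)  # [30]
```

Requiring both conditions is what filters out bars where only one factor moves, such as a volume spike with no accompanying price move.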

This is not a general-purpose price dataset. It is a feature-engineered anomaly corpus built specifically for ML and quantitative research.

What You Get

200+ High-Quality Anomaly Events from:

Strict Dual-Factor Detection Logic:

ML-Ready Feature Engineering:

Forward-Looking Outcome Labels. Each event includes:

Multi-Scale Context Windows

Clean JSONL Format. Stream directly into Python, Pandas, PyTorch, or TensorFlow dataloaders, with no preprocessing required.
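As a rough sketch of what streaming JSONL looks like in plain Python, the example below parses one record per line using only the standard library. The field names (`ticker`, `interval`, `label`) and the sample records are hypothetical placeholders, not the dataset's real schema.

```python
import io
import json

def iter_events(lines):
    """Yield one dict per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)

# Hypothetical sample records standing in for the real file contents.
sample = io.StringIO(
    '{"ticker": "NVDA", "interval": "1m", "label": 1}\n'
    '{"ticker": "AAPL", "interval": "5m", "label": 0}\n'
)

events = list(iter_events(sample))
tickers = [event["ticker"] for event in events]
print(tickers)  # ['NVDA', 'AAPL']
```

With a file on disk, the same generator accepts an open file handle, for example `open("events.jsonl", encoding="utf-8")` (the filename is an assumption). If pandas is installed, `pandas.read_json(path, lines=True)` loads the same format into a DataFrame.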


💡 Common Use Cases


🚀 Start Modeling Immediately

Get the full dataset with Forward-Looking Outcome Labels and Multi-Scale Context Windows.

Buy on Gumroad ($49)

Or preview the data structure:

View Sample on Hugging Face

📬 Contact & Support

If you have any questions about this dataset, licensing, or access to the full version, feel free to reach out:

📧 Email: [email protected]

Please note that this email is intended for dataset-related inquiries only. We aim to respond within 1–2 business days.