Built for desks running their own models
The Daily Signal API is the right product for systems that consume signals row by row: CRMs, procurement gates, vendor-risk dashboards. Bulk datasets are the right product for desks that consume the population: hedge-fund quant teams running factor backtests, M&A origination desks pulling target lists, alt-data vendors blending UK iXBRL into multi-region products.
If you’ve ever scraped Companies House yourself, this is the product that pays for itself in the first week.
What you can export
- The full UK iXBRL population — every active filer, every tag, every business day. Refreshed nightly at 04:00 GMT.
- The 41-check stability taxonomy — every signal event, every entity, with provenance back to the originating filing reference.
- The SIC peer-set distributions — pre-computed margins, current ratios, staff-cost ratios at percentile-resolution against the live population. No need to recompute the comparables yourself.
- The five-year point-in-time archive — every signal event from 2021 onward with no look-ahead bias. Critical for desks that need to defend their backtests under research compliance.
Cuts can be filtered by SIC code, region (postcode-level), turnover band, signal family, or any combination thereof. We’ve shipped buy-box exports as narrow as ~250 rows and as broad as the full ~5M-entity active register.
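As a rough illustration of how a point-in-time cut avoids look-ahead bias, the sketch below keeps only events that were already public on a chosen cutoff date, then narrows by SIC prefix and signal family. The field names (`published_at`, `signal_family` and so on), the sample company numbers and the family labels are assumptions made for the example, not the shipped schema.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SignalEvent:
    # Hypothetical field names; the real export schema may differ.
    company_number: str
    sic_code: str
    signal_family: str
    event_date: date     # accounting period the event describes
    published_at: date   # date the filing became publicly available


def as_of(events, cutoff, sic_prefix=None, signal_family=None):
    """Return events visible on `cutoff`, optionally narrowed by SIC prefix and family."""
    selected = []
    for e in events:
        # Point-in-time rule: ignore anything published after the cutoff,
        # even if the period it describes ended earlier.
        if e.published_at > cutoff:
            continue
        if sic_prefix is not None and not e.sic_code.startswith(sic_prefix):
            continue
        if signal_family is not None and e.signal_family != signal_family:
            continue
        selected.append(e)
    return selected


# Made-up sample rows for illustration only.
events = [
    SignalEvent("00000001", "62012", "margin", date(2022, 12, 31), date(2023, 9, 30)),
    SignalEvent("00000002", "62020", "margin", date(2023, 3, 31), date(2024, 1, 15)),
    SignalEvent("00000003", "41100", "zombie", date(2022, 6, 30), date(2023, 3, 1)),
]

visible = as_of(events, date(2023, 12, 31), sic_prefix="62")
print([e.company_number for e in visible])
```

The second row describes a period ending in March 2023 but is excluded from the December 2023 view because, in this made-up data, it was not published until January 2024. Filtering on the publication date rather than the period date is what keeps a backtest free of look-ahead.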
Format options
We ship the same data three ways depending on how your stack consumes it:
- CSV — for desks that pipe straight into Excel, Power BI, Tableau or Looker. UTF-8 BOM included for Excel compatibility on Windows.
- Parquet — for quant desks running pandas, Polars, Spark or DuckDB. Schema includes column-level statistics so partition-pruning works out of the box.
- Direct SQL — for desks that want a managed Postgres, BigQuery or Snowflake table they can JOIN against. Refreshed nightly via Fivetran-compatible schemas, or pushed direct to a Snowflake share for the low-latency tier.
The schema is identical across formats. Pick whichever lands best in your pipeline.
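One practical consequence of the BOM in the CSV option: non-Excel consumers should decode with a BOM-aware codec, otherwise the first column name picks up an invisible prefix. A minimal sketch in Python, using a made-up two-column file (the column names are assumptions, not the real schema):

```python
import csv
import io

# Hypothetical export content. Encoding with "utf-8-sig" prepends the
# UTF-8 byte-order mark (EF BB BF), mimicking a BOM-prefixed CSV file.
raw = "company_number,sic_code\r\n00000001,62012\r\n".encode("utf-8-sig")


def read_rows(data: bytes, encoding: str):
    """Decode the bytes with the given codec and parse them as CSV dict rows."""
    text = io.StringIO(data.decode(encoding), newline="")
    return list(csv.DictReader(text))


naive = read_rows(raw, "utf-8")      # BOM survives as "\ufeff" in the first header
clean = read_rows(raw, "utf-8-sig")  # BOM is stripped during decoding

print(list(naive[0].keys()))
print(list(clean[0].keys()))
```

With plain `utf-8` the first header becomes `"\ufeffcompany_number"`, so lookups by `"company_number"` fail. With `utf-8-sig` the headers come through as written.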
Customisation
The default cut works for most desks, but bespoke exports are the norm. The most common customisations:
- Custom factor builds — combine multiple signal events into a single derived field, computed daily across the population. Examples already in production include “Zombie + Threshold Crossing within 12 months” and “Staff-cost cliff + Director-resignation cluster”.
- Cross-reference enrichment — overlay your own watchlist of company numbers with full population context (peer-set rank, regional benchmark, SIC peer median).
- Custom backfill windows — five-year point-in-time is the default; ten years is available for desks running longer-horizon factor backtests.
- Restated-filing handling — if a company restates its accounts, the archive can be configured either to replace the original record (clean) or preserve both (audit-grade). Defaults to clean for backtests, audit-grade for compliance use cases.
Every customisation is delivered against a written spec we agree up front, with sample rows for sign-off before the full cut ships. No surprise schemas.
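To make the custom factor builds above concrete, here is one way a "signal A + signal B within 12 months" field could be computed. This is a sketch under stated assumptions: events arrive as `(company_number, signal, date)` tuples, the signal labels are hypothetical, the two events may occur in either order, and "12 months" is approximated as 365 days. A production spec would pin each of these down.

```python
from collections import defaultdict
from datetime import date, timedelta

WINDOW = timedelta(days=365)  # assumption: "12 months" approximated as 365 days


def co_occurring(events, first, second, window=WINDOW):
    """Return company numbers with a `first` and a `second` event at most `window` apart."""
    # Group event dates by company, then by signal label.
    dates = defaultdict(lambda: defaultdict(list))
    for company, signal, when in events:
        dates[company][signal].append(when)

    flagged = set()
    for company, by_signal in dates.items():
        # Compare every pair of dates across the two signals, in either order.
        if any(
            abs(a - b) <= window
            for a in by_signal.get(first, [])
            for b in by_signal.get(second, [])
        ):
            flagged.add(company)
    return flagged


# Made-up events; signal labels are illustrative, not the real taxonomy names.
events = [
    ("00000001", "zombie", date(2023, 1, 31)),
    ("00000001", "threshold_crossing", date(2023, 11, 30)),
    ("00000002", "zombie", date(2021, 6, 30)),
    ("00000002", "threshold_crossing", date(2023, 6, 30)),
    ("00000003", "zombie", date(2023, 5, 1)),
]

print(co_occurring(events, "zombie", "threshold_crossing"))
```

Only the first company is flagged: the second has both signals but two years apart, and the third has only one of them.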
How desks typically use it
A short tour of how the bulk product gets used in the wild:
- Quant funds ingest the daily Parquet, blend it with their existing UK alt-data factors, and run nightly backtests against the five-year archive. The Real-Time Margin Shift and Ghost-Tech factors have produced clean backtest performance across the cycle, particularly in the lower-mid revenue band.
- M&A origination pulls a weekly CSV of new £10.2M-threshold crossings, Zombie candidates and top-5%-margin entities for a target SIC. Replaces broker subscriptions for sourcing in the £1M-£50M turnover band.
- Insurance underwriting teams ingest the SIC peer-set distributions into their pricing models to risk-adjust UK private-company premia at quote time. The dataset is consumed via a Snowflake share.
- Alt-data vendors re-license the UK cut alongside their existing US and EU coverage. We do not white-label, but we do allow embedded redistribution under a commercial agreement.
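For the M&A use case above, a threshold-crossing pull could in principle be derived from consecutive turnover figures. The sketch below is illustrative only. It assumes a crossing means the previous filed turnover was below £10.2M and the latest is at or above it, and that filings arrive as `(period_end, turnover)` pairs keyed by company number. Neither the definition nor the data shape is confirmed by the product.

```python
THRESHOLD_GBP = 10_200_000  # the £10.2M figure quoted in the text


def new_crossings(filings, threshold=THRESHOLD_GBP):
    """Return companies whose latest turnover is >= threshold while the previous was below.

    `filings` maps company_number -> list of (period_end_iso, turnover_gbp).
    """
    crossed = []
    for company, history in filings.items():
        ordered = sorted(history)  # ISO date strings sort chronologically
        if len(ordered) < 2:
            continue  # a crossing needs two periods to compare
        (_, previous), (_, latest) = ordered[-2], ordered[-1]
        if previous < threshold <= latest:
            crossed.append(company)
    return sorted(crossed)


# Made-up turnover histories for illustration.
filings = {
    "00000001": [("2022-12-31", 9_400_000), ("2023-12-31", 11_100_000)],
    "00000002": [("2022-12-31", 12_000_000), ("2023-12-31", 13_500_000)],
    "00000003": [("2023-12-31", 10_900_000)],
    "00000004": [("2023-12-31", 10_200_000), ("2022-12-31", 8_000_000)],
}

print(new_crossings(filings))
```

The second company is already above the threshold in both periods and the third has only one filing, so neither counts as a new crossing under this definition.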
Pricing
Pricing scales by entity count and refresh cadence, not by seat count. A typical engagement looks like:
- Pilot bulk cut: from £4,500 / quarter, a quarterly CSV / Parquet drop, capped at 50,000 entities.
- Daily refresh: from £2,400 / month, a daily Parquet drop with a 12-month rolling window.
- Five-year point-in-time backfill: from £18,000 one-off, full archive licence for backtesting under research compliance.
- Enterprise / Snowflake share: custom, direct Snowflake share or BigQuery transfer, dedicated infrastructure, named engineer.
All tiers include the same iXBRL filing-reference provenance. Bulk delivery is the only thing that changes.