Build a Cybersecurity Threat Intelligence Feed for SaaS

Kent Hudson


A cybersecurity threat intelligence feed is a continuous, filtered stream of security signals — breaches, CVEs, exploited vulnerabilities, and vendor incidents — that is normalized into a single channel, scored for relevance to your stack, and routed to the people who can act on it. Most articles on this topic tell you to subscribe to 12 commercial feeds. For a small or mid-sized SaaS, that's 12× the noise — and a $20k–100k/yr bill — for a problem that's solvable with one well-built aggregator under $50/mo.

This is for SaaS engineering and security teams that want signal, not noise: alerts about breaches in your dependencies (Postgres, Stripe, Cloudflare, Auth0), CVEs in libraries you ship, and incidents at your customers' brands. (Disclosure: APITube is the news API used in the example, and the sponsor of this article. The pattern works with NewsAPI, GNews, NewsCatcher; CISA KEV and the NVD CVE feed are free.)

Key takeaways:

  1. The 80/20 stack is CISA KEV + NVD CVE feed + a structured news API. You don't need MISP, you don't need a SIEM module, you don't need a $30k/yr commercial feed.
  2. Score every signal along four axes — KEV match, CVSS severity, vendor-in-your-stack, recency. Anything below threshold goes to a digest, not a page.
  3. Filter by your dependencies. A breach at Stripe matters; a breach at a CRM you don't use is noise.
  4. Route by severity: P0 → PagerDuty, P1 → Slack #sec-alerts, P2+ → daily digest email.
  5. A working aggregator is ~150 lines of Python + a tiny SQLite store. Total monthly cost: ~$30 (news API tier) + ~$5 (Hetzner box).

The problem

A typical SaaS team has the same threat-intel problem in three flavors:

  • Engineering: "is the Postgres CVE that just hit news in our minor version?"
  • Customer success: "one of our top 50 customers is in the news for a breach — should we reach out?"
  • Compliance: "did anything happen to our subprocessors today that I need to log?"

What's on offer:

  1. Subscribe to 12 free feeds + RSS them into Feedly. You get firehose-volume noise, no severity scoring, no filtering by stack, no routing.
  2. Buy a commercial threat-intel platform ($10k–100k/yr — Recorded Future, Mandiant, Anomali). Overkill for a 30-engineer SaaS, and the relevance signal is still generic.
  3. Build the small thing yourself. This article walks through that path.

The gap in the public discussion is that nobody writes this third option down. Top-10 SERP results are vendor listicles ("12 free feeds to follow") and SEO-optimized "what is a TI feed" pages, none of which gets near a working architecture.

Solution overview

You're building a small daemon that does five things on a 30-minute cron:

  1. Pulls fresh items from three sources: a news API (sentiment + entities + topics), the CISA Known Exploited Vulnerabilities (KEV) catalog, and the NVD CVE feed.
  2. Normalizes each item into a common ThreatSignal schema.
  3. Dedupes against an SQLite store of seen IDs.
  4. Scores each signal against your stack and a relevance threshold.
  5. Routes by severity: PagerDuty for P0, Slack for P1, daily digest for P2.

Unlike a commercial platform, which sells you a generic feed plus dashboards you'll never open, this aggregator only surfaces what intersects with your vendor list and your customer list — which means signal-to-noise ratio is set by your config file, not by a vendor's idea of "important."

Architecture

                     ┌────────────────┐
                     │  CISA KEV JSON │──┐
                     └────────────────┘  │
                     ┌────────────────┐  │
                     │  NVD CVE feed  │──┤
                     └────────────────┘  │
                     ┌────────────────┐  │     ┌──────────────┐    ┌──────────┐
                     │  APITube news  │──┼────▶│  Normalizer  │───▶│  SQLite  │
                     │  topic:cyber   │  │     │  + Dedup     │    │  (seen)  │
                     └────────────────┘  │     └──────┬───────┘    └──────────┘
                                         │            │
                                         │            ▼
                                         │     ┌──────────────┐
                                         │     │   Scorer     │
                                         │     │ KEV+CVSS+    │
                                         │     │ stack+recent │
                                         │     └──────┬───────┘
                                         │            │
                                         │            ▼
                                         │     ┌──────────────┐
                                         │     │   Router     │
                                         │     └──────┬───────┘
                                         │            │
                                         │      ┌─────┴─────┬───────────────┐
                                         │      ▼           ▼               ▼
                                         │  PagerDuty    Slack       Daily digest
                                         │   (P0)       (P1)         (P2)
                                         │
                                       cron 30m

Three data sources, one normalizer, one persistent dedup store, one scorer, three outputs. Total moving parts: small enough to fit in your head.

Implementation

Stack config (the part that makes it SaaS-specific)

# stack.yml
vendors:
  critical:                # P0 if breach
    - postgres
    - postgresql
    - stripe
    - cloudflare
    - auth0
    - okta
  important:               # P1 if breach
    - sendgrid
    - twilio
    - datadog
    - segment
  customers:               # match by entity name (case-insensitive)
    - "Acme Corp"
    - "Globex"
    - "Initech"
libraries:                 # CVE matching by package name
    - django
    - psycopg
    - requests
    - jwt
keywords:                  # plain-text fallback for news scan
    - "supply chain attack"
    - "credential stuffing"

This file is the whole reason your aggregator beats a commercial feed for SaaS use — it knows what you care about.
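
Because every downstream decision hangs off this file, it may be worth checking its shape once at startup. The following is a minimal sketch, assuming the YAML has already been parsed into a dict (for example with yaml.safe_load); validate_stack and _names are hypothetical helpers, not part of the original pipeline.

```python
def _names(value) -> set[str]:
    """Lower-cased names from a list; anything that is not a list yields an empty set."""
    return {str(v).lower() for v in value} if isinstance(value, list) else set()

def validate_stack(cfg: dict) -> list[str]:
    """Return human-readable problems; an empty list means the config looks usable."""
    vendors = cfg.get("vendors")
    if not isinstance(vendors, dict):
        return ["'vendors' must be a mapping"]

    problems = []
    for key in ("critical", "important", "customers"):
        value = vendors.get(key)
        if not isinstance(value, list) or not value:
            problems.append(f"vendors.{key} must be a non-empty list")
    for key in ("libraries", "keywords"):
        value = cfg.get(key)
        if not isinstance(value, list) or not value:
            problems.append(f"{key} must be a non-empty list")

    # A vendor in both tiers is ambiguous: the scorer checks the critical tier first.
    overlap = _names(vendors.get("critical")) & _names(vendors.get("important"))
    for name in sorted(overlap):
        problems.append(f"'{name}' is listed as both critical and important")
    return problems
```

Calling validate_stack(cfg) right after loading and refusing to start on a non-empty result keeps a typo in the config from silently turning every page into a digest entry.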

Sources

# sources.py
import os, requests
from datetime import datetime, timezone

def fetch_kev() -> list[dict]:
    r = requests.get("https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json", timeout=15)
    r.raise_for_status()
    return r.json().get("vulnerabilities", [])

def fetch_cves(modified_since: str) -> list[dict]:
    r = requests.get("https://services.nvd.nist.gov/rest/json/cves/2.0",
                     params={"lastModStartDate": modified_since,
                             "lastModEndDate": datetime.now(timezone.utc).isoformat()},
                     timeout=20)
    r.raise_for_status()
    return r.json().get("vulnerabilities", [])

def fetch_news_cyber() -> list[dict]:
    r = requests.get("https://api.apitube.io/v1/news/everything",
                     params={"topic.id": "cybersecurity",
                             "language.code": "en",
                             "per_page": 50,
                             "sort": "published_at:desc"},
                     headers={"X-API-Key": os.environ["APITUBE_API_KEY"]},
                     timeout=15)
    r.raise_for_status()
    return r.json().get("results", [])

The news API call is the magic ingredient. APITube returns each article with entities (people, organizations, brands), topics, and sentiment already attached — so you can filter for entity.name in stack.critical server-side instead of running entity extraction yourself.

Normalizer

# normalize.py
from dataclasses import dataclass, field
from datetime import datetime
from typing import Literal

Kind = Literal["news", "kev", "cve"]

@dataclass
class ThreatSignal:
    id: str                     # stable dedup key
    kind: Kind
    title: str
    url: str
    summary: str
    published_at: datetime
    entities: list[str] = field(default_factory=list)
    cvss: float | None = None
    cve_ids: list[str] = field(default_factory=list)
    raw: dict = field(default_factory=dict)

Three small adapter functions (from_news, from_kev, from_cve) each return a ThreatSignal. Keeping the shape uniform is what lets one scorer handle all three sources.

Dedup

# dedup.py
import sqlite3
db = sqlite3.connect("seen.db")
db.execute("CREATE TABLE IF NOT EXISTS seen (id TEXT PRIMARY KEY, ts INTEGER)")

def is_new(id_: str) -> bool:
    return db.execute("SELECT 1 FROM seen WHERE id=?", (id_,)).fetchone() is None

def mark(id_: str) -> None:
    db.execute("INSERT OR IGNORE INTO seen VALUES (?, strftime('%s','now'))", (id_,))
    db.commit()

For the news source, id is the article URL. For KEV, it's the CVE ID. For CVE, it's <cve_id>:<lastModified> so a re-scored CVE re-triggers exactly once.
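
To make the ID scheme concrete, the sketch below runs the same is_new/mark pair against an in-memory database and shows that an unchanged CVE stays suppressed while a modified one counts as new. signal_id is a hypothetical helper that only restates the rule above; the CVE IDs and timestamps are made up.

```python
import sqlite3

db = sqlite3.connect(":memory:")   # in-memory for the demo; the article uses seen.db
db.execute("CREATE TABLE IF NOT EXISTS seen (id TEXT PRIMARY KEY, ts INTEGER)")

def is_new(id_: str) -> bool:
    return db.execute("SELECT 1 FROM seen WHERE id=?", (id_,)).fetchone() is None

def mark(id_: str) -> None:
    db.execute("INSERT OR IGNORE INTO seen VALUES (?, strftime('%s','now'))", (id_,))
    db.commit()

def signal_id(kind: str, *, url: str = "", cve_id: str = "", last_modified: str = "") -> str:
    """Build the dedup key for each source kind."""
    if kind == "news":
        return url
    if kind == "kev":
        return cve_id
    if kind == "cve":
        return f"{cve_id}:{last_modified}"
    raise ValueError(f"unknown kind: {kind}")

first = signal_id("cve", cve_id="CVE-2024-0002", last_modified="2024-05-01T10:00:00.000")
mark(first)

again = signal_id("cve", cve_id="CVE-2024-0002", last_modified="2024-05-01T10:00:00.000")
updated = signal_id("cve", cve_id="CVE-2024-0002", last_modified="2024-05-03T08:30:00.000")
print(is_new(again), is_new(updated))   # False True
```

Note that a KEV entry and an NVD entry for the same CVE get different keys, so a CVE that is later added to KEV is scored again under the KEV rules.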

The scorer (the actual interesting part)

# score.py
import yaml
from datetime import datetime, timezone

with open("stack.yml") as f:          # close the file handle once the config is parsed
    cfg = yaml.safe_load(f)
CRIT = {v.lower() for v in cfg["vendors"]["critical"]}
IMP  = {v.lower() for v in cfg["vendors"]["important"]}
CUST = {c.lower() for c in cfg["vendors"]["customers"]}
LIBS = {l.lower() for l in cfg["libraries"]}

def severity(sig) -> int:
    """Return 0=ignore, 1=digest, 2=slack, 3=pagerduty"""
    text = (sig.title + " " + sig.summary).lower()
    ents = {e.lower() for e in sig.entities}

    if sig.kind == "kev":
        affected = (sig.raw.get("vendorProject", "") + " " + sig.raw.get("product", "")).lower()
        if any(v in affected for v in CRIT): return 3   # critical KEV → PagerDuty
        if any(v in affected for v in IMP):  return 2
        return 1

    if sig.kind == "cve":
        cvss = sig.cvss or 0
        prod = " ".join(sig.entities).lower()
        if any(lib in prod for lib in LIBS) and cvss >= 7.0:  return 3
        if cvss >= 9.0: return 2
        if cvss >= 7.0: return 1
        return 0

    if sig.kind == "news":
        age_hours = (datetime.now(timezone.utc) - sig.published_at).total_seconds() / 3600
        recency = max(0, 1 - age_hours / 48)              # 0..1
        crit_hit = bool(ents & CRIT) or any(v in text for v in CRIT)
        cust_hit = bool(ents & CUST)
        if "breach" in text and crit_hit:    return 3
        if cust_hit and "breach" in text:    return 3
        if crit_hit and recency > 0.3:       return 2
        if cust_hit:                          return 2
        if any(k.lower() in text for k in cfg["keywords"]):  return 1   # text is lower-cased, so lower-case keywords too
        return 0

This is opinionated on purpose. A KEV match against a critical vendor goes to PagerDuty because if CISA flagged it as actively exploited and you depend on that vendor, you have a same-day decision to make. A CVE in a library you don't depend on never pages: it reaches Slack only at CVSS 9.0 or higher, lands in the digest at 7.0 or higher, and returns 0 below that. Irrelevance is half the value of a TI feed.
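
The recency term in the news branch is worth working through once. It decays linearly from 1 to 0 over 48 hours, and a critical-vendor mention without the word "breach" reaches Slack only while recency is above 0.3, which works out to articles younger than roughly 33.6 hours. The sketch below just replays that arithmetic; recency and cutoff_age are illustrative helper names.

```python
def recency(age_hours: float, horizon_hours: float = 48.0) -> float:
    """Linear decay from 1.0 (just published) to 0.0 (at the horizon or older)."""
    return max(0.0, 1 - age_hours / horizon_hours)

def cutoff_age(threshold: float, horizon_hours: float = 48.0) -> float:
    """Age in hours at which recency has fallen to the given threshold."""
    return (1 - threshold) * horizon_hours

for age in (0, 12, 24, 36, 48, 72):
    r = recency(age)
    print(f"{age:>2}h  recency={r:.2f}  reaches_slack={r > 0.3}")

print(f"cutoff is about {cutoff_age(0.3):.1f} hours")
```

Past the cutoff, the same article falls through to the customer and keyword checks, so an old story about a critical vendor ends up in the digest at most.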

Router

# route.py
import os, requests, json

def to_pagerduty(sig):
    resp = requests.post("https://events.pagerduty.com/v2/enqueue", timeout=10, json={
        "routing_key": os.environ["PAGERDUTY_KEY"],
        "event_action": "trigger",
        "payload": {
            "summary": f"[SEC-P0] {sig.title}",
            "severity": "critical",
            "source": "ti-feed",
            "custom_details": {"url": sig.url, "kind": sig.kind, "cves": sig.cve_ids},
        },
    })
    resp.raise_for_status()   # surface HTTP errors so a failed page is not marked as seen

def to_slack(sig, channel="#sec-alerts"):
    resp = requests.post(os.environ["SLACK_WEBHOOK_URL"], timeout=10, json={
        # Channel override is typically honoured by legacy webhooks only;
        # app-based webhooks post to the channel they were installed for.
        "channel": channel,
        "text": f"*[{sig.kind.upper()}] {sig.title}*\n{sig.summary}\n<{sig.url}|source>"
                + (f"\nCVEs: {', '.join(sig.cve_ids)}" if sig.cve_ids else ""),
    })
    resp.raise_for_status()   # same reasoning: do not swallow a rejected alert

def to_digest(sig):
    # append to a daily file, send at 09:00 UTC by separate cron
    with open("digest.ndjson", "a") as f:
        f.write(json.dumps({"title": sig.title, "url": sig.url, "kind": sig.kind}) + "\n")
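
The separate 09:00 UTC job is not shown. One way it might render the queued entries is sketched below; build_digest is a hypothetical helper, the grouping by kind and the ".sending" file name are assumptions, and actually delivering the text (email, Slack, or anything else) is left out.

```python
import json
from collections import defaultdict
from pathlib import Path

def build_digest(path: str = "digest.ndjson") -> str:
    """Render queued digest entries as plain text grouped by kind; '' if nothing is queued."""
    src = Path(path)
    if not src.exists():
        return ""
    # Move the file aside first so the 30-minute job can keep appending to a fresh one.
    batch = src.with_suffix(".sending")
    src.replace(batch)

    by_kind = defaultdict(list)
    for line in batch.read_text().splitlines():
        if line.strip():
            item = json.loads(line)
            by_kind[item["kind"]].append(item)
    batch.unlink()

    out = []
    for kind in sorted(by_kind):
        out.append(f"{kind.upper()} ({len(by_kind[kind])})")
        out.extend(f"  - {i['title']}  {i['url']}" for i in by_kind[kind])
    return "\n".join(out)
```

An empty return value is a reasonable signal to skip sending altogether on quiet days.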

The 30-minute cron loop

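# run.py
from datetime import datetime, timedelta, timezone

from sources  import fetch_kev, fetch_cves, fetch_news_cyber
from normalize import from_news, from_kev, from_cve
from dedup    import is_new, mark
from score    import severity
from route    import to_pagerduty, to_slack, to_digest

def run():
    # Ask NVD for CVEs modified in the last hour (the lookback is an assumption);
    # it overlaps the 30-minute schedule on purpose, and dedup drops repeats.
    since = (datetime.now(timezone.utc) - timedelta(hours=1)).isoformat()

    sigs = [from_news(a) for a in fetch_news_cyber()] \
         + [from_kev(k)  for k in fetch_kev()] \
         + [from_cve(c)  for c in fetch_cves(since)]

    for sig in sigs:
        if not is_new(sig.id): continue
        sev = severity(sig)
        if sev == 3: to_pagerduty(sig)
        elif sev == 2: to_slack(sig)
        elif sev == 1: to_digest(sig)
        mark(sig.id)          # only reached if routing did not raise

if __name__ == "__main__":
    run()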

That's the whole thing: about 150 lines of code spread across six files plus a config.
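
One operational detail the loop leaves open is the very first run. fetch_kev returns the full catalog every time, so against an empty seen.db every historical KEV entry looks new, and any entry that matches a critical vendor would page. A possible way around that is to seed the store once before enabling the cron; seed is a hypothetical helper that takes the existing is_new and mark functions as arguments.

```python
from typing import Callable, Iterable

def seed(ids: Iterable[str],
         is_new: Callable[[str], bool],
         mark: Callable[[str], None]) -> int:
    """Mark every not-yet-seen ID as seen without routing it; return how many were added."""
    added = 0
    for id_ in ids:
        if is_new(id_):
            mark(id_)
            added += 1
    return added
```

With the modules above this could be called once as seed((from_kev(k).id for k in fetch_kev()), is_new, mark). The trade-off is that anything already in the catalog on day one is never alerted on, so it is worth reviewing the backlog by hand first.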

Results / what you actually get

After two weeks of running this on a real 30-engineer SaaS team (sample size of one — your mileage will vary):

Channel                 | Volume          | Signal-to-noise (subjective)
------------------------|-----------------|---------------------------------------
PagerDuty (P0)          | ~1 page / month | Every page was actionable
Slack #sec-alerts (P1)  | 4–6 / day       | ~70% read, ~20% triggered a discussion
Daily digest (P2)       | 30–60 / day     | Skimmed in 2 minutes once a day

The PagerDuty volume is what makes the system viable. If you're being woken up for non-actionable noise, you'll mute the channel within a week and the system fails. The strict scoring (KEV + critical-vendor match, or breach + customer-name match) is the part that has to be tuned by you, not by a vendor.

Cost

Setup                                                            | Recurring cost    | Trade-off
-----------------------------------------------------------------|-------------------|--------------------------------------------------------------
This aggregator (news API + KEV + NVD + cron + Slack/PagerDuty)  | ~$30–50 / mo      | Maintain ~150 LOC + a config file
Subscribe to 12 free feeds via Feedly Pro+                       | $96 / yr          | Manual triage, no PagerDuty path
MISP self-hosted                                                 | $0 + ops cost     | Powerful but heavy; designed for IOC sharing, not SaaS triage
Recorded Future / Mandiant / Anomali                             | $30k–100k+ / yr   | Generic enterprise feed, dashboards you won't use, sales cycle

Above ~200 engineers or in a regulated industry (banking, healthcare), commercial TI is justifiable. Below that, the build option wins on relevance and cost simultaneously — which is the rare combination.

FAQ

What is a threat intelligence feed?

A threat intelligence feed is a continuous stream of security signals — vulnerability disclosures, breach reports, exploited CVEs, and indicators of compromise — delivered in a structured format so a security team can detect, prioritize, and respond to threats. The feed's value depends entirely on how well it filters for relevance to your specific stack, customers, and vendors; a generic firehose has near-zero signal-to-noise for a small SaaS.

How do I build a threat intel feed?

To build a threat intelligence feed, combine three free or low-cost sources — the CISA KEV catalog (free), the NVD CVE feed (free), and a structured news API (~$30/mo) — into a normalizer that produces a uniform ThreatSignal schema, dedupe against an SQLite store, score each signal against your vendor and customer list, and route by severity to PagerDuty, Slack, or a daily digest. The whole pipeline is roughly 150 lines of Python plus a YAML config of your dependencies.

What sources should I use for threat intelligence?

The minimum viable sources for a SaaS threat intelligence feed are CISA KEV (actively exploited vulnerabilities, free), the NVD CVE feed (all disclosed CVEs with CVSS scores, free), and a sentiment- and entity-aware news API for breach and vendor-incident coverage. Optional second-tier sources include MITRE ATT&CK, AlienVault OTX, and Abuse.ch — add them only after you have the first three filtered well, otherwise you're stacking firehoses.

How do I monitor cyber threats relevant to my SaaS?

To monitor cyber threats relevant to your SaaS, maintain a config file listing your critical vendors (Postgres, Stripe, Cloudflare, Auth0, etc.), important vendors, top customers by name, and shipped libraries. Score every incoming signal against this list — KEV match against a critical vendor or breach mention plus customer name should page; everything else routes to Slack or a daily digest. Relevance comes from the config, not from the source.
