Build a Cybersecurity Threat Intelligence Feed for SaaS
A cybersecurity threat intelligence feed is a continuous, filtered stream of security signals — breaches, CVEs, exploited vulnerabilities, and vendor incidents — that is normalized into a single channel, scored for relevance to your stack, and routed to the people who can act on it. Most articles on this topic tell you to subscribe to 12 commercial feeds. For a small or mid-sized SaaS, that's 12× the noise — and a $20k–100k/yr bill — for a problem that's solvable with one well-built aggregator under $50/mo.
This is for SaaS engineering and security teams that want signal, not noise: alerts about breaches in your dependencies (Postgres, Stripe, Cloudflare, Auth0), CVEs in libraries you ship, and incidents at your customers' brands. (Disclosure: APITube is the news API used in the example, and the sponsor of this article. The pattern works with NewsAPI, GNews, NewsCatcher; CISA KEV and the NVD CVE feed are free.)
Key takeaways:
- The 80/20 stack is CISA KEV + NVD CVE feed + a structured news API. You don't need MISP, you don't need a SIEM module, you don't need a $30k/yr commercial feed.
- Score every signal along four axes — KEV match, CVSS severity, vendor-in-your-stack, recency. Anything below threshold goes to a digest, not a page.
- Filter by your dependencies. A breach at Stripe matters; a breach at a CRM you don't use is noise.
- Route by severity: P0 → PagerDuty, P1 → Slack #sec-alerts, P2+ → daily digest email.
- A working aggregator is ~150 lines of Python + a tiny SQLite store. Total monthly cost: ~$30 (news API tier) + ~$5 (Hetzner box).
The problem
A typical SaaS team has the same threat-intel problem in three flavors:
- Engineering: "is the Postgres CVE that just hit news in our minor version?"
- Customer success: "one of our top 50 customers is in the news for a breach — should we reach out?"
- Compliance: "did anything happen to our subprocessors today that I need to log?"
What's on offer:
- Subscribe to 12 free feeds + RSS them into Feedly. You get firehose-volume noise, no severity scoring, no filtering by stack, no routing.
- Buy a commercial threat-intel platform ($10k–100k/yr — Recorded Future, Mandiant, Anomali). Overkill for a 30-engineer SaaS, and the relevance signal is still generic.
- Build the small thing yourself. This article walks through that path.
The gap in the public discussion is that nobody writes this third option down. Top-10 SERP results are vendor listicles ("12 free feeds to follow") and SEO-optimized "what is a TI feed" pages, none of which gets near a working architecture.
Solution overview
You're building a small daemon that does five things on a 30-minute cron:
- Pulls fresh items from three sources: a news API (sentiment + entities + topics), the CISA Known Exploited Vulnerabilities (KEV) catalog, and the NVD CVE feed.
- Normalizes each item into a common ThreatSignal schema.
- Dedupes against an SQLite store of seen IDs.
- Scores each signal against your stack and a relevance threshold.
- Routes by severity: PagerDuty for P0, Slack for P1, daily digest for P2.
Unlike a commercial platform, which sells you a generic feed plus dashboards you'll never open, this aggregator only surfaces what intersects with your vendor list and your customer list — which means signal-to-noise ratio is set by your config file, not by a vendor's idea of "important."
Architecture
┌────────────────┐
│ CISA KEV JSON  │──┐
└────────────────┘  │
┌────────────────┐  │
│ NVD CVE feed   │──┤
└────────────────┘  │
┌────────────────┐  │     ┌──────────────┐    ┌──────────┐
│ APITube news   │──┼────▶│  Normalizer  │───▶│  SQLite  │
│ topic:cyber    │  │     │  + Dedup     │    │  (seen)  │
└────────────────┘  │     └──────┬───────┘    └──────────┘
                    │            │
                    │            ▼
                    │     ┌──────────────┐
                    │     │   Scorer     │
                    │     │  KEV+CVSS+   │
                    │     │ stack+recent │
                    │     └──────┬───────┘
                    │            │
                    │            ▼
                    │     ┌──────────────┐
                    │     │   Router     │
                    │     └──────┬───────┘
                    │            │
                    │      ┌─────┴─────┬───────────────┐
                    │      ▼           ▼               ▼
                    │  PagerDuty     Slack       Daily digest
                    │     (P0)        (P1)            (P2)
                    │
                 cron 30m
Three data sources, one normalizer, one persistent dedup store, one scorer, three outputs. Total moving parts: small enough to fit in your head.
Implementation
Stack config (the part that makes it SaaS-specific)
# stack.yml
vendors:
  critical:            # P0 if breach
    - postgres
    - postgresql
    - stripe
    - cloudflare
    - auth0
    - okta
  important:           # P1 if breach
    - sendgrid
    - twilio
    - datadog
    - segment
  customers:           # match by entity name (case-insensitive); score.py reads vendors.customers
    - "Acme Corp"
    - "Globex"
    - "Initech"
libraries:             # CVE matching by package name
  - django
  - psycopg
  - requests
  - jwt
keywords:              # plain-text fallback for news scan
  - "supply chain attack"
  - "credential stuffing"
This file is the whole reason your aggregator beats a commercial feed for SaaS use — it knows what you care about.
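Because every downstream decision hangs on this file, a typo in it tends to fail quietly: a misspelled key or an empty list just means nothing ever matches. The following is a minimal sketch of a startup check, not part of the original code. It assumes the config has already been parsed into a dict (for example with `yaml.safe_load`) and that `customers` sits under `vendors`, which is what score.py reads; `validate_stack` is a hypothetical helper name.

```python
def validate_stack(cfg: dict) -> list[str]:
    """Return human-readable problems; an empty list means the config looks usable."""
    vendors = cfg.get("vendors")
    if not isinstance(vendors, dict):
        return ["'vendors' must be a mapping with critical/important/customers lists"]

    # Every section the scorer reads, keyed by a dotted name for error messages
    sections = {
        "vendors.critical": vendors.get("critical"),
        "vendors.important": vendors.get("important"),
        "vendors.customers": vendors.get("customers"),
        "libraries": cfg.get("libraries"),
        "keywords": cfg.get("keywords"),
    }
    problems = []
    for name, value in sections.items():
        if not isinstance(value, list) or not value:
            problems.append(f"'{name}' must be a non-empty list")
        elif not all(isinstance(v, str) and v.strip() for v in value):
            problems.append(f"'{name}' must contain only non-empty strings")

    # A vendor in both tiers always resolves to critical, because that tier is checked first
    crit = {v.lower() for v in vendors.get("critical") or [] if isinstance(v, str)}
    imp = {v.lower() for v in vendors.get("important") or [] if isinstance(v, str)}
    for dup in sorted(crit & imp):
        problems.append(f"'{dup}' is listed as both critical and important")
    return problems
```

Calling this once at the top of the cron job and exiting on a non-empty result is one way to keep a broken config from silently muting the feed.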
Sources
# sources.py
import os
import requests
from datetime import datetime, timezone


def fetch_kev() -> list[dict]:
    """Return the full CISA KEV catalog (the feed is a single JSON document)."""
    r = requests.get("https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json", timeout=15)
    r.raise_for_status()
    return r.json().get("vulnerabilities", [])


def fetch_cves(modified_since: str) -> list[dict]:
    """Return CVEs modified between modified_since (ISO-8601) and now, from the NVD 2.0 API."""
    r = requests.get("https://services.nvd.nist.gov/rest/json/cves/2.0",
                     params={"lastModStartDate": modified_since,
                             "lastModEndDate": datetime.now(timezone.utc).isoformat()},
                     timeout=20)
    r.raise_for_status()
    return r.json().get("vulnerabilities", [])


def fetch_news_cyber() -> list[dict]:
    """Return the 50 most recent English-language cybersecurity articles."""
    r = requests.get("https://api.apitube.io/v1/news/everything",
                     params={"topic.id": "cybersecurity",
                             "language.code": "en",
                             "per_page": 50,
                             "sort": "published_at:desc"},
                     headers={"X-API-Key": os.environ["APITUBE_API_KEY"]},
                     timeout=15)
    r.raise_for_status()
    return r.json().get("results", [])
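The news API call is the magic ingredient. APITube returns each article with entities (people, organizations, brands), topics, and sentiment already attached, so you can filter for entity.name in stack.critical server-side instead of running entity extraction yourself. The request above filters only by topic and language, so in this walkthrough the vendor and customer matching happens client-side, in the scorer.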
Normalizer
# normalize.py
from dataclasses import dataclass, field
from datetime import datetime
from typing import Literal

Kind = Literal["news", "kev", "cve"]


@dataclass
class ThreatSignal:
    id: str                      # stable dedup key
    kind: Kind
    title: str
    url: str
    summary: str
    published_at: datetime
    entities: list[str] = field(default_factory=list)
    cvss: float | None = None
    cve_ids: list[str] = field(default_factory=list)
    raw: dict = field(default_factory=dict)
Three small adapter functions (from_news, from_kev, from_cve) each return a ThreatSignal. Keeping the shape uniform is what lets one scorer handle all three sources.
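The adapters themselves are not shown, so here is one way they might look. This is a sketch under stated assumptions rather than the article's implementation: the KEV and NVD field names follow the public JSON formats, while the news field names (`url`, `title`, `description`, `published_at`, `entities[].name`) are guesses about the news API response that you should adjust to your provider. `parse_utc` and `nvd_url` are hypothetical helpers. The dataclass is repeated only so the block runs on its own.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ThreatSignal:  # same shape as normalize.py, repeated so this sketch is self-contained
    id: str
    kind: str
    title: str
    url: str
    summary: str
    published_at: datetime
    entities: list = field(default_factory=list)
    cvss: Optional[float] = None
    cve_ids: list = field(default_factory=list)
    raw: dict = field(default_factory=dict)


def parse_utc(value: str) -> datetime:
    """Parse an ISO-8601 string; naive values are treated as UTC so ages can be computed."""
    dt = datetime.fromisoformat(value.replace("Z", "+00:00"))
    return dt if dt.tzinfo else dt.replace(tzinfo=timezone.utc)


def nvd_url(cve_id: str) -> str:
    return f"https://nvd.nist.gov/vuln/detail/{cve_id}"


def from_kev(item: dict) -> ThreatSignal:
    cve_id = item["cveID"]
    return ThreatSignal(
        id=cve_id, kind="kev",
        title=item.get("vulnerabilityName", cve_id),
        url=nvd_url(cve_id),
        summary=item.get("shortDescription", ""),
        published_at=parse_utc(item["dateAdded"]),  # date-only string, e.g. "2021-12-10"
        entities=[item.get("vendorProject", ""), item.get("product", "")],
        cve_ids=[cve_id], raw=item,
    )


def from_cve(item: dict) -> ThreatSignal:
    cve = item["cve"]
    cve_id = cve["id"]
    summary = next((d["value"] for d in cve.get("descriptions", []) if d.get("lang") == "en"), "")
    # Use the CVSS v3.1 base score when NVD has assigned one
    metrics = cve.get("metrics", {}).get("cvssMetricV31", [])
    cvss = metrics[0]["cvssData"]["baseScore"] if metrics else None
    # Pull vendor and product names out of CPE strings: cpe:2.3:a:<vendor>:<product>:...
    products = set()
    for config in cve.get("configurations", []):
        for node in config.get("nodes", []):
            for match in node.get("cpeMatch", []):
                parts = match.get("criteria", "").split(":")
                if len(parts) > 4:
                    products.update(parts[3:5])
    return ThreatSignal(
        id=f"{cve_id}:{cve.get('lastModified', '')}", kind="cve",
        title=cve_id, url=nvd_url(cve_id), summary=summary,
        published_at=parse_utc(cve["published"]),
        entities=sorted(products), cvss=cvss, cve_ids=[cve_id], raw=item,
    )


def from_news(article: dict) -> ThreatSignal:
    # Assumed field names; check them against your news API's actual response
    url = article["url"]
    return ThreatSignal(
        id=url, kind="news", title=article.get("title", ""), url=url,
        summary=article.get("description") or "",
        published_at=parse_utc(article["published_at"]),
        entities=[e["name"] for e in article.get("entities", []) if e.get("name")],
        raw=article,
    )
```

Making every `published_at` timezone-aware matters because the scorer subtracts it from `datetime.now(timezone.utc)`, and Python refuses to subtract a naive datetime from an aware one.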
Dedup
# dedup.py
import sqlite3

db = sqlite3.connect("seen.db")
db.execute("CREATE TABLE IF NOT EXISTS seen (id TEXT PRIMARY KEY, ts INTEGER)")


def is_new(id_: str) -> bool:
    return db.execute("SELECT 1 FROM seen WHERE id=?", (id_,)).fetchone() is None


def mark(id_: str) -> None:
    db.execute("INSERT OR IGNORE INTO seen VALUES (?, strftime('%s','now'))", (id_,))
    db.commit()
For the news source, id is the article URL. For KEV, it's the CVE ID. For CVE, it's <cve_id>:<lastModified> so a re-scored CVE re-triggers exactly once.
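The composite CVE key is easier to see in action. The sketch below reuses the same table and queries against an in-memory database; `open_store` and `cve_key` are hypothetical helper names, and the CVE ID and timestamps are made up for illustration.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the seen-IDs store; ':memory:' keeps the demo off disk."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS seen (id TEXT PRIMARY KEY, ts INTEGER)")
    return db


def is_new(db: sqlite3.Connection, id_: str) -> bool:
    return db.execute("SELECT 1 FROM seen WHERE id=?", (id_,)).fetchone() is None


def mark(db: sqlite3.Connection, id_: str) -> None:
    db.execute("INSERT OR IGNORE INTO seen VALUES (?, strftime('%s','now'))", (id_,))
    db.commit()


def cve_key(cve_id: str, last_modified: str) -> str:
    """Dedup key for NVD items: a new lastModified value yields a new key."""
    return f"{cve_id}:{last_modified}"


db = open_store()
first = cve_key("CVE-2024-0001", "2024-01-02T00:00:00.000")
rescored = cve_key("CVE-2024-0001", "2024-02-10T09:30:00.000")

mark(db, first)
print(is_new(db, first))     # same revision: already seen
print(is_new(db, rescored))  # NVD modified the record: treated as new
```

One consequence of this design is that any NVD modification re-triggers the CVE, including edits that do not change the score, so the key trades a few repeat alerts for never missing a re-score.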
The scorer (the actual interesting part)
# score.py
import yaml
from datetime import datetime, timezone

with open("stack.yml") as f:
    cfg = yaml.safe_load(f)

# Lowercase everything once so matching is case-insensitive
CRIT = {v.lower() for v in cfg["vendors"]["critical"]}
IMP = {v.lower() for v in cfg["vendors"]["important"]}
CUST = {c.lower() for c in cfg["vendors"]["customers"]}
LIBS = {l.lower() for l in cfg["libraries"]}
KEYWORDS = {k.lower() for k in cfg["keywords"]}


def severity(sig) -> int:
    """Return 0=ignore, 1=digest, 2=slack, 3=pagerduty"""
    text = (sig.title + " " + sig.summary).lower()
    ents = {e.lower() for e in sig.entities}

    if sig.kind == "kev":
        affected = (sig.raw.get("vendorProject", "") + " " + sig.raw.get("product", "")).lower()
        if any(v in affected for v in CRIT): return 3  # critical KEV → PagerDuty
        if any(v in affected for v in IMP): return 2
        return 1

    if sig.kind == "cve":
        cvss = sig.cvss or 0
        prod = " ".join(sig.entities).lower()
        if any(lib in prod for lib in LIBS) and cvss >= 7.0: return 3
        if cvss >= 9.0: return 2
        if cvss >= 7.0: return 1
        return 0

    if sig.kind == "news":
        # published_at must be timezone-aware for this subtraction to work
        age_hours = (datetime.now(timezone.utc) - sig.published_at).total_seconds() / 3600
        recency = max(0, 1 - age_hours / 48)  # 0..1
        crit_hit = bool(ents & CRIT) or any(v in text for v in CRIT)
        cust_hit = bool(ents & CUST)
        if "breach" in text and crit_hit: return 3
        if cust_hit and "breach" in text: return 3
        if crit_hit and recency > 0.3: return 2
        if cust_hit: return 2
        if any(k in text for k in KEYWORDS): return 1
        return 0

    return 0  # unknown kind: ignore rather than return None
This is opinionated on purpose. A KEV match against a critical vendor goes to PagerDuty because if CISA flagged it as actively exploited and you depend on that vendor, you have a same-day decision to make. A high-CVSS CVE in a library you don't depend on never pages: it is demoted to Slack (CVSS ≥ 9.0) or the digest (CVSS ≥ 7.0), and anything below 7.0 returns 0. Irrelevance is half the value of a TI feed.
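The recency term in the news branch is easy to misread, so here is the arithmetic as a small sketch. It restates the scorer's formula, `max(0, 1 - age_hours / 48)`, and solves for the age at which the `recency > 0.3` check stops passing; `recency` and `cutoff_age` are hypothetical helper names, not functions from score.py.

```python
def recency(age_hours: float, window_hours: float = 48.0) -> float:
    """Linear decay from 1.0 (just published) to 0.0 (window_hours old or older)."""
    return max(0.0, 1.0 - age_hours / window_hours)


def cutoff_age(threshold: float, window_hours: float = 48.0) -> float:
    """Age in hours at which recency has decayed to the given threshold."""
    return (1.0 - threshold) * window_hours


for age in (1, 12, 24, 36, 48):
    r = recency(age)
    verdict = "passes" if r > 0.3 else "fails"
    print(f"{age:>2}h old: recency {r:.2f} {verdict} the 0.3 check")

print(f"cutoff: about {cutoff_age(0.3):.1f} hours")
```

With the scorer's constants, a critical-vendor mention without the word "breach" reaches Slack only while the article is younger than roughly 33.6 hours; after that it falls through to the customer and keyword checks.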
Router
# route.py
import os, requests, json


def to_pagerduty(sig):
    requests.post("https://events.pagerduty.com/v2/enqueue", timeout=10, json={
        "routing_key": os.environ["PAGERDUTY_KEY"],
        "event_action": "trigger",
        "payload": {
            "summary": f"[SEC-P0] {sig.title}",
            "severity": "critical",
            "source": "ti-feed",
            "custom_details": {"url": sig.url, "kind": sig.kind, "cves": sig.cve_ids},
        },
    })


def to_slack(sig, channel="#sec-alerts"):
    requests.post(os.environ["SLACK_WEBHOOK_URL"], timeout=10, json={
        "channel": channel,
        "text": f"*[{sig.kind.upper()}] {sig.title}*\n{sig.summary}\n<{sig.url}|source>"
                + (f"\nCVEs: {', '.join(sig.cve_ids)}" if sig.cve_ids else ""),
    })


def to_digest(sig):
    # append to a daily file, send at 09:00 UTC by separate cron
    with open("digest.ndjson", "a") as f:
        f.write(json.dumps({"title": sig.title, "url": sig.url, "kind": sig.kind}) + "\n")
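The separate 09:00 UTC job that turns digest.ndjson into an email is not shown. A minimal sketch of the rendering half follows, assuming the three fields that to_digest writes (title, url, kind); `build_digest` is a hypothetical name, and the actual sending and file rotation are left out.

```python
import json
from collections import defaultdict


def build_digest(lines) -> str:
    """Group digest entries by kind and render a plain-text email body."""
    by_kind = defaultdict(list)
    seen_urls = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the file
        item = json.loads(line)
        if item["url"] in seen_urls:
            continue  # skip the same link appearing twice in one day
        seen_urls.add(item["url"])
        by_kind[item["kind"]].append(item)

    parts = []
    for kind in ("kev", "cve", "news"):  # most actionable source first
        items = by_kind.get(kind, [])
        if not items:
            continue
        parts.append(f"{kind.upper()} ({len(items)})")
        parts.extend(f"  - {i['title']}\n    {i['url']}" for i in items)
    return "\n".join(parts) if parts else "No new P2 signals."
```

A file object can be passed directly, as in `with open("digest.ndjson") as f: body = build_digest(f)`, after which the job would send `body` and truncate the file for the next day.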
The 30-minute cron loop
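# run.py
from datetime import datetime, timedelta, timezone

from sources import fetch_kev, fetch_cves, fetch_news_cyber
from normalize import from_news, from_kev, from_cve
from dedup import is_new, mark
from score import severity
from route import to_pagerduty, to_slack, to_digest

# Assumption: look back one hour so consecutive 30-minute runs overlap; dedup absorbs repeats
CVE_LOOKBACK = timedelta(hours=1)


def run():
    since = (datetime.now(timezone.utc) - CVE_LOOKBACK).isoformat()
    # Note: the first run sees the whole KEV catalog as new; consider seeding seen.db first
    sigs = [from_news(a) for a in fetch_news_cyber()] \
         + [from_kev(k) for k in fetch_kev()] \
         + [from_cve(c) for c in fetch_cves(since)]
    for sig in sigs:
        if not is_new(sig.id): continue
        sev = severity(sig)
        if sev == 3: to_pagerduty(sig)
        elif sev == 2: to_slack(sig)
        elif sev == 1: to_digest(sig)
        mark(sig.id)  # reached only if routing did not raise


if __name__ == "__main__":
    run()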
That's the whole thing: about 150 lines of code spread across six files plus a config.
Results / what you actually get
After two weeks of running this on a real 30-engineer SaaS team (sample size of one — your mileage will vary):
| Channel | Volume | Signal-to-noise (subjective) |
|---|---|---|
| PagerDuty (P0) | ~1 page / month | Every page was actionable |
| Slack #sec-alerts (P1) | 4–6 / day | ~70% read, ~20% triggered a discussion |
| Daily digest (P2) | 30–60 / day | Skimmed in 2 minutes once a day |
The PagerDuty volume is what makes the system viable. If you're being woken up for non-actionable noise, you'll mute the channel within a week and the system fails. The strict scoring (KEV + critical-vendor match, or breach + customer-name match) is the part that has to be tuned by you, not by a vendor.
Cost
| Setup | Recurring cost | Trade-off |
|---|---|---|
| This aggregator (news API + KEV + NVD + cron + Slack/PagerDuty) | ~$30–50 / mo | Maintain ~150 LOC + a config file |
| Subscribe to 12 free feeds via Feedly Pro+ | $96 / yr | Manual triage, no PagerDuty path |
| MISP self-hosted | $0 + ops cost | Powerful but heavy; designed for IOC sharing, not SaaS triage |
| Recorded Future / Mandiant / Anomali | $30k–100k+ / yr | Generic enterprise feed, dashboards you won't use, sales cycle |
Above ~200 engineers or in a regulated industry (banking, healthcare), commercial TI is justifiable. Below that, the build option wins on relevance and cost simultaneously — which is the rare combination.
FAQ
What is a threat intelligence feed?
A threat intelligence feed is a continuous stream of security signals — vulnerability disclosures, breach reports, exploited CVEs, and indicators of compromise — delivered in a structured format so a security team can detect, prioritize, and respond to threats. The feed's value depends entirely on how well it filters for relevance to your specific stack, customers, and vendors; a generic firehose has near-zero signal-to-noise for a small SaaS.
How do I build a threat intel feed?
To build a threat intelligence feed, combine three free or low-cost sources — the CISA KEV catalog (free), the NVD CVE feed (free), and a structured news API (~$30/mo) — into a normalizer that produces a uniform ThreatSignal schema, dedupe against an SQLite store, score each signal against your vendor and customer list, and route by severity to PagerDuty, Slack, or a daily digest. The whole pipeline is roughly 150 lines of Python plus a YAML config of your dependencies.
What sources should I use for threat intelligence?
The minimum viable sources for a SaaS threat intelligence feed are CISA KEV (actively exploited vulnerabilities, free), the NVD CVE feed (all disclosed CVEs with CVSS scores, free), and a sentiment- and entity-aware news API for breach and vendor-incident coverage. Optional second-tier sources include MITRE ATT&CK, AlienVault OTX, and Abuse.ch — add them only after you have the first three filtered well, otherwise you're stacking firehoses.
How do I monitor cyber threats relevant to my SaaS?
To monitor cyber threats relevant to your SaaS, maintain a config file listing your critical vendors (Postgres, Stripe, Cloudflare, Auth0, etc.), important vendors, top customers by name, and shipped libraries. Score every incoming signal against this list — KEV match against a critical vendor or breach mention plus customer name should page; everything else routes to Slack or a daily digest. Relevance comes from the config, not from the source.
