Build a Django News Portal in 2026: Full Stack Tutorial

A Django news portal is a full-stack web application that fetches articles from a news API on a schedule, stores them with rich metadata (entities, categories, sentiment), serves them through cached views with infinite scroll, and exposes search across the corpus. This one is built with Django 5, Celery beat, Redis, and HTMX. Unlike beginner Django tutorials that enter dummy news manually in the admin, this guide ships a portal that pulls live articles every 15 minutes and aims to serve them in under 50 ms once cached.

This tutorial walks through building the portal end-to-end in about 350 lines of Django code. The data is live from a real news API. The frontend uses HTMX for infinite scroll instead of a JavaScript build pipeline. The last section gives you a decision framework for when each piece of complexity (Celery, Redis, Postgres) is worth adding.

What You'll Build

A working Django news portal with seven components:

Models — Article, Source, Category with JSONFields for entities, topics, and sentiment
Celery beat ingestion — task that pulls new articles every 15 minutes with idempotent upsert
Redis cache — Django cache framework with per-article TTL aligned to publication freshness
Class-based views — ListView + DetailView with cache decorators
HTMX infinite scroll — hx-trigger="revealed" partial template, no JavaScript build
Postgres full-text search — SearchVector + GIN index across title and body
Decision framework — when to add each layer vs starting with SQLite

Who this is for: Django developers building a news aggregator, a brand-monitoring portal, an internal newsroom dashboard, or any content site that consumes a third-party news feed.

Prerequisites

Python 3.12+ and Django 5.1+
A news API key — this guide uses APITube because every article comes back with categories, topics, entities, and sentiment already attached, which keeps the model layer small. Any news API with structured metadata works.
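PostgreSQL 16+ and Redis 7+ (both can wait at the start; the decision framework section explains when to add them. The search_vector field and GIN index in Step 1 are Postgres-only, so leave them out if you prototype on SQLite)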
Packages:

pip install django "celery[redis]" redis psycopg httpx django-htmx

export APITUBE_KEY="your_key_here"
export DATABASE_URL="postgres://user:pass@localhost:5432/newsportal"
export CELERY_BROKER_URL="redis://localhost:6379/0"
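Django does not read DATABASE_URL on its own, so settings.py has to turn it into a DATABASES entry. A minimal sketch using only the standard library follows; database_from_url is a hypothetical helper name, not part of Django, and packages such as dj-database-url cover more schemes and options.

```python
# project/settings.py (sketch): turn DATABASE_URL into a DATABASES entry.
# database_from_url is a hypothetical helper, not a Django API.
import os
from urllib.parse import unquote, urlparse


def database_from_url(url):
    parts = urlparse(url)
    if parts.scheme not in ("postgres", "postgresql"):
        raise ValueError(f"unsupported database scheme: {parts.scheme!r}")
    return {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": parts.path.lstrip("/"),
        # Credentials may be percent-encoded inside a URL, so decode them.
        "USER": unquote(parts.username or ""),
        "PASSWORD": unquote(parts.password or ""),
        "HOST": parts.hostname or "",
        "PORT": str(parts.port or ""),
    }


DATABASES = {
    "default": database_from_url(
        os.environ.get("DATABASE_URL", "postgres://user:pass@localhost:5432/newsportal")
    )
}
```

The fallback URL is the example value from the export above; in a real deployment you would likely want a missing variable to fail loudly instead.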

Step 1 — Models for news articles

A real news portal needs more than a title + body. Each article comes from a source, belongs to one or more categories and topics, mentions named entities, and carries sentiment. Use JSONField for the open-ended bits to avoid premature normalization:

# news/models.py
from django.db import models
from django.contrib.postgres.search import SearchVectorField
from django.contrib.postgres.indexes import GinIndex

class Source(models.Model):
    domain = models.CharField(max_length=255, unique=True)
    country_code = models.CharField(max_length=2, blank=True)

    def __str__(self):
        return self.domain


class Category(models.Model):
    slug = models.SlugField(max_length=80, unique=True)
    name = models.CharField(max_length=120)


class Article(models.Model):
    external_id = models.CharField(max_length=64, unique=True)  # unique=True already creates an index
    title = models.CharField(max_length=500)
    description = models.TextField(blank=True)
    body = models.TextField(blank=True)
    href = models.URLField(max_length=2000)
    image = models.URLField(max_length=2000, blank=True)
    published_at = models.DateTimeField()  # indexed through Meta.indexes below
    source = models.ForeignKey(Source, on_delete=models.CASCADE, related_name="articles")
    categories = models.ManyToManyField(Category, related_name="articles", blank=True)

    # Open-ended metadata — APITube returns rich nested structures here
    entities = models.JSONField(default=list, blank=True)
    topics = models.JSONField(default=list, blank=True)
    sentiment = models.JSONField(default=dict, blank=True)

    search_vector = SearchVectorField(null=True, editable=False)

    class Meta:
        ordering = ("-published_at",)
        indexes = [
            GinIndex(fields=["search_vector"], name="article_search_idx"),
            models.Index(fields=["-published_at"], name="article_pub_idx"),
        ]

    def __str__(self):
        return self.title

Two design notes. First, external_id is unique and indexed — that's what makes the upsert in the Celery task idempotent against repeated polls. Second, search_vector is a Postgres-specific field with a GIN index; we'll populate it in the ingestion task so search stays fast as the corpus grows.
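The idempotency that the unique external_id buys can be checked without Django. The sketch below uses SQLite from the standard library as a stand-in for Postgres; the table is a cut-down imitation of the Article model and upsert is an illustrative helper. Django's update_or_create reaches the same end state through its own lookup-then-write logic rather than this exact SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE article (external_id TEXT NOT NULL UNIQUE, title TEXT NOT NULL)"
)


def upsert(external_id, title):
    # Insert a new row, or overwrite the title when external_id already exists.
    conn.execute(
        "INSERT INTO article (external_id, title) VALUES (?, ?) "
        "ON CONFLICT(external_id) DO UPDATE SET title = excluded.title",
        (external_id, title),
    )


# The same article arrives in three consecutive polls, the last with an edited title.
upsert("a1", "Original headline")
upsert("a1", "Original headline")
upsert("a1", "Corrected headline")
upsert("b2", "Another story")

rows = conn.execute(
    "SELECT external_id, title FROM article ORDER BY external_id"
).fetchall()
print(rows)  # [('a1', 'Corrected headline'), ('b2', 'Another story')]
```

Three writes for a1 leave exactly one row, carrying the latest title.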

Run migrations and register a minimal admin so you can verify ingestion visually:

python manage.py makemigrations news && python manage.py migrate

# news/admin.py
from django.contrib import admin
from .models import Article, Source, Category

admin.site.register([Source, Category])
admin.site.register(Article, list_display=("title", "source", "published_at"))

Step 2 — Celery beat for live news ingestion

This is the part many beginner tutorials skip entirely: a real news portal pulls articles automatically, not via the admin form. Celery beat, the periodic-task scheduler that ships with Celery, is the usual choice in Django projects.

Configure Celery in your project:

# project/celery.py
import os
from celery import Celery

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "project.settings")
app = Celery("project")
app.config_from_object("django.conf:settings", namespace="CELERY")
app.autodiscover_tasks()

# project/__init__.py
from .celery import app as celery_app
__all__ = ("celery_app",)

# project/settings.py — relevant pieces
import os
CELERY_BROKER_URL = os.environ["CELERY_BROKER_URL"]
CELERY_TIMEZONE = "UTC"
CELERY_BEAT_SCHEDULE = {
    "fetch-news-every-15-min": {
        "task": "news.tasks.fetch_news",
        "schedule": 900.0,  # 15 minutes
    },
}
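The same schedule can also be written with Celery's crontab class, sketched here as an alternative to the 900-second interval. The difference is alignment: an interval counts from when beat starts, while a crontab entry fires on wall-clock boundaries.

```python
# project/settings.py (alternative schedule, sketch)
from celery.schedules import crontab

CELERY_BEAT_SCHEDULE = {
    "fetch-news-every-15-min": {
        "task": "news.tasks.fetch_news",
        # Fires at :00, :15, :30 and :45 of every hour.
        "schedule": crontab(minute="*/15"),
    },
}
```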

The fetcher task does three things: pull recent articles, upsert them on external_id, and refresh the search vector for any new rows.

# news/tasks.py
import os
import httpx
from celery import shared_task
from datetime import datetime, timedelta, timezone
from django.contrib.postgres.search import SearchVector
from django.db import transaction
from .models import Article, Source, Category

APITUBE_KEY = os.environ["APITUBE_KEY"]
BASE = "https://api.apitube.io/v1/news/everything"

@shared_task(bind=True, max_retries=3, default_retry_delay=60)
def fetch_news(self, category="medtop:13000000", per_page=50):  # IPTC media-topic id (e.g. technology)
    since = (datetime.now(timezone.utc) - timedelta(minutes=30)).isoformat()
    params = {
        "language.code": "en",
        "category.id": category,
        "published_at.start": since,
        "per_page": per_page,
    }
    try:
        r = httpx.get(BASE, params=params, headers={"X-API-Key": APITUBE_KEY}, timeout=15)
        r.raise_for_status()
    except httpx.HTTPError as exc:
        raise self.retry(exc=exc)

    results = r.json().get("results", [])  # parse the response body once
    new_ids = []
    for item in results:
        src = item.get("source") or {}
        location = src.get("location") or {}
        source, _ = Source.objects.get_or_create(
            domain=src["domain"],
            defaults={"country_code": (location.get("country_code") or "")[:2]},
        )
        # One transaction per article: the row and its category links commit together.
        with transaction.atomic():
            article, created = Article.objects.update_or_create(
                external_id=str(item["id"]),
                defaults={
                    "title": item["title"][:500],
                    # `or ...` guards against explicit nulls in the API response
                    "description": item.get("description") or "",
                    "body": item.get("body") or "",
                    "href": item["href"],
                    "image": item.get("image") or "",
                    "published_at": item["published_at"],
                    "source": source,
                    "entities": item.get("entities") or [],
                    "topics": item.get("topics") or [],
                    "sentiment": item.get("sentiment") or {},
                },
            )
            if created:
                new_ids.append(article.pk)
            cats = [
                Category.objects.get_or_create(slug=c["id"], defaults={"name": c.get("name", c["id"])})[0]
                for c in item.get("categories") or []
            ]
            article.categories.set(cats)

    if new_ids:
        # Only new rows get a search vector here; edited rows keep their old vector.
        Article.objects.filter(pk__in=new_ids).update(
            search_vector=SearchVector("title", weight="A") + SearchVector("body", weight="B")
        )
    return {"fetched": len(results), "new": len(new_ids)}

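Two production touches. The 30-minute lookback window with 15-minute scheduling means consecutive polls overlap, so a single missed run doesn't lose articles (two misses in a row would). The update_or_create keyed on external_id makes re-fetches idempotent: the same article can arrive in three consecutive polls and never be duplicated. One caveat: the task requests a single page of per_page=50 results, so a busier category needs pagination or a larger page size to avoid dropping articles.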
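The window arithmetic is easy to check in isolation. The following is a small simulation, not the real task: the article ids and timestamps are made up, a dict stands in for the Article table, and the 12:30 run is skipped to imitate an outage.

```python
from datetime import datetime, timedelta, timezone

LOOKBACK = timedelta(minutes=30)
INTERVAL = timedelta(minutes=15)

start = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)

# One hypothetical article published every 5 minutes from 12:00 to 12:55.
published = {f"art-{i}": start + timedelta(minutes=5 * i) for i in range(12)}

# Scheduled runs at 12:15, 12:30, 12:45 and 13:00; the 12:30 run never happens.
runs = [start + INTERVAL * n for n in range(1, 5)]
missed = {start + timedelta(minutes=30)}

store = {}    # external_id -> published_at, stands in for the Article table
fetches = 0   # total rows handed back across all polls

for run_at in runs:
    if run_at in missed:
        continue
    window_start = run_at - LOOKBACK
    for external_id, ts in published.items():
        if window_start <= ts < run_at:
            store[external_id] = ts  # keyed write: a repeat overwrites, never duplicates
            fetches += 1

print(len(store), fetches)  # 12 15
```

Fifteen rows come back across three polls, yet the store ends with all 12 articles and no duplicates. Skipping both the 12:30 and 12:45 runs would leave the 12:15 to 12:25 articles outside every window, which is why the lookback only covers one missed run.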

Start the workers:

celery -A project worker -l info
celery -A project beat -l info

Step 3 — Redis cache for hot articles

The article-detail view runs the same database query thousands of times per hour for any popular article. Django's cache framework in front of Redis fixes that.

Configure the cache:

# project/settings.py
import os
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        "LOCATION": os.environ.get("REDIS_URL", "redis://127.0.0.1:6379/1"),
    }
}

Cache article objects with a TTL that grows with article age, because a 5-minute-old breaking story should refresh more often than a week-old archive piece:

# news/cache.py
from django.core.cache import cache
from datetime import datetime, timezone
from .models import Article

def article_ttl(published_at):
    age_hours = (datetime.now(timezone.utc) - published_at).total_seconds() / 3600
    if age_hours < 1: return 60          # 1 min — breaking
    if age_hours < 24: return 300        # 5 min — recent
    return 3600                          # 1 hour — older

def get_article_cached(external_id):
    key = f"article:{external_id}"
    article = cache.get(key)
    if article is None:
        article = Article.objects.select_related("source").prefetch_related("categories").get(external_id=external_id)
        cache.set(key, article, timeout=article_ttl(article.published_at))
    return article
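To see the tiers in action, here is a standalone copy of article_ttl with one change that is not in the original: an optional now argument, added only so the example is deterministic.

```python
from datetime import datetime, timedelta, timezone


def article_ttl(published_at, now=None):
    # Same tiers as news/cache.py; `now` is an added parameter for repeatable results.
    now = now or datetime.now(timezone.utc)
    age_hours = (now - published_at).total_seconds() / 3600
    if age_hours < 1:
        return 60
    if age_hours < 24:
        return 300
    return 3600


now = datetime(2026, 1, 10, 12, 0, tzinfo=timezone.utc)
for label, age in [
    ("breaking", timedelta(minutes=5)),
    ("recent", timedelta(hours=6)),
    ("archive", timedelta(days=7)),
]:
    print(label, article_ttl(now - age, now=now))
# breaking 60
# recent 300
# archive 3600
```

The TTL is computed when the entry is written, so an article moves to the next tier only when its current entry expires and is cached again.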

Cache invalidation on update is a one-liner via signal:

# news/signals.py
from django.db.models.signals import post_save
from django.dispatch import receiver
from django.core.cache import cache
from .models import Article

@receiver(post_save, sender=Article)
def invalidate_article_cache(sender, instance, **kwargs):
    cache.delete(f"article:{instance.external_id}")

Wire signals in apps.py so they load at startup:

# news/apps.py
from django.apps import AppConfig

class NewsConfig(AppConfig):
    default_auto_field = "django.db.models.BigAutoField"
    name = "news"
    def ready(self):
        from . import signals  # noqa

Step 4 — Class-based views

Django's ListView and DetailView cover the two pages you need. Keep them thin — heavy lifting belongs in models and managers.

# news/views.py
from django.http import Http404
from django.views.generic import ListView, DetailView
from django.views.decorators.cache import cache_page
from django.utils.decorators import method_decorator
from .cache import get_article_cached
from .models import Article

PAGE_SIZE = 20

@method_decorator(cache_page(60), name="dispatch")  # 60s page cache for the list
class ArticleListView(ListView):
    model = Article
    template_name = "news/list.html"
    context_object_name = "articles"
    paginate_by = PAGE_SIZE

    def get_queryset(self):
        # Load only the columns the list template renders
        return Article.objects.select_related("source").only(
            "id", "external_id", "title", "description", "image", "published_at", "source"
        )


class ArticleDetailView(DetailView):
    template_name = "news/detail.html"
    context_object_name = "article"

    def get_object(self, queryset=None):
        # get_article_cached uses .get(), so turn a miss into a 404 instead of a 500
        try:
            return get_article_cached(self.kwargs["external_id"])
        except Article.DoesNotExist:
            raise Http404("Article not found")

URL config:

# news/urls.py
from django.urls import path
from .views import ArticleListView, ArticleDetailView, ArticleListPartial, ArticleSearchView

urlpatterns = [
    path("", ArticleListView.as_view(), name="article-list"),
    path("page/", ArticleListPartial.as_view(), name="article-list-partial"),
    path("search/", ArticleSearchView.as_view(), name="article-search"),
    path("<str:external_id>/", ArticleDetailView.as_view(), name="article-detail"),
]

Step 5 — HTMX infinite scroll (no JavaScript build)

Most tutorials that rank for this topic default to plain templates with full-page reloads, which is not what readers expect in 2026. HTMX gives you infinite scroll in about 15 lines of HTML, with no React, no Webpack, and no build step.

Install HTMX in your base template:

<!-- templates/base.html -->
<head>
  <!-- Unversioned URL resolves to the latest release; pin htmx.org@x.y.z in production -->
  <script src="https://unpkg.com/htmx.org" defer></script>
</head>

The list view template renders the first page, then asks HTMX to fetch the next page when the sentinel scrolls into view:

<!-- templates/news/list.html -->
{% extends "base.html" %}
{% block content %}
<h1>Latest News</h1>
<div id="articles">
  {% include "news/_articles.html" %}
</div>
{% endblock %}

<!-- templates/news/_articles.html -->
{% for article in articles %}
  <article class="card">
    <h2><a href="{% url 'article-detail' article.external_id %}">{{ article.title }}</a></h2>
    <p>{{ article.description|truncatechars:160 }}</p>
    <small>{{ article.source.domain }} — {{ article.published_at|date:"j M, H:i" }}</small>
  </article>
{% endfor %}

{% if page_obj.has_next %}
<div hx-get="{% url 'article-list-partial' %}?page={{ page_obj.next_page_number }}"
     hx-trigger="revealed"
     hx-swap="outerHTML">
  Loading...
</div>
{% endif %}

The partial view returns just the article block plus the next sentinel:

# news/views.py — append
from django.views.generic import ListView

class ArticleListPartial(ArticleListView):
    template_name = "news/_articles.html"

That's the complete pattern. When the sentinel div enters the viewport, HTMX fires a GET, swaps the response into its own slot, and the new response brings its own next-page sentinel. No JavaScript you wrote.
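An alternative to a separate partial URL is to let one view pick its template per request. HTMX marks every request it issues with an HX-Request: true header, so the choice comes down to one header check. The sketch below isolates that check in a hypothetical template_for helper and feeds it plain dicts; in a view you would pass request.headers, which Django makes case-insensitive.

```python
FULL_TEMPLATE = "news/list.html"
PARTIAL_TEMPLATE = "news/_articles.html"


def template_for(headers):
    # HTMX sets HX-Request to the string "true" on requests it issues.
    is_htmx = headers.get("HX-Request") == "true"
    return PARTIAL_TEMPLATE if is_htmx else FULL_TEMPLATE


print(template_for({}))                      # news/list.html
print(template_for({"HX-Request": "true"}))  # news/_articles.html
```

Inside a ListView this would sit in get_template_names, returning [template_for(self.request.headers)]. The django-htmx package from the prerequisites exposes the same check as request.htmx once its HtmxMiddleware is enabled. One caution: if a single URL serves both variants behind cache_page, the response also needs to vary on HX-Request (for example with vary_on_headers), otherwise a cached full page can be handed to an HTMX request. The two-URL layout used in this guide avoids that problem.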

Step 6 — Postgres full-text search

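Plain Article.objects.filter(title__icontains=q) compiles to a case-insensitive LIKE '%q%' match that an ordinary B-tree index cannot serve, so every search scans the whole table; expect it to become noticeably slow somewhere around 50,000 articles. Postgres full-text search with the GIN index from Step 1 looks terms up in the index instead and can stay under 100 ms past a million rows, depending on hardware and query shape.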

# news/views.py — append
from django.views.generic import ListView
from django.contrib.postgres.search import SearchQuery, SearchRank
from django.db.models import F

class ArticleSearchView(ListView):
    template_name = "news/search.html"
    context_object_name = "articles"
    paginate_by = PAGE_SIZE

    def get_queryset(self):
        q = self.request.GET.get("q", "").strip()
        if not q:
            return Article.objects.none()
        query = SearchQuery(q, search_type="websearch")
        return (
            Article.objects.annotate(rank=SearchRank(F("search_vector"), query))
            .filter(search_vector=query)
            .order_by("-rank", "-published_at")
            .select_related("source")
        )

search_type="websearch" accepts Google-style operators ("exact phrase", -exclude, OR) without you parsing anything. The SearchRank ordering surfaces the most relevant articles first; secondary published_at sort breaks ties toward freshness.

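The search template is just a form pointing at this view, followed by the same _articles.html partial. Infinite scroll needs one adjustment on search results: the sentinel in _articles.html requests article-list-partial without the search term, so on the search page its hx-get URL has to target a search endpoint that renders only the partial and carries q alongside page.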
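For the sentinel to page through search results rather than the main feed, its next-page URL has to carry the q parameter along with page. A minimal sketch of building that URL safely follows; next_page_url is a hypothetical helper, and the /search/ path assumes news/urls.py is mounted at the site root.

```python
from urllib.parse import urlencode


def next_page_url(path, query, page):
    # urlencode escapes spaces, quotes and other characters that are unsafe in a URL.
    return f"{path}?{urlencode({'q': query, 'page': page})}"


url = next_page_url("/search/", '"rate cut" -crypto', 2)
print(url)  # /search/?q=%22rate+cut%22+-crypto&page=2
```

In a template the equivalent is to append &q={{ request.GET.q|urlencode }} to the hx-get URL, which relies on the request context processor being enabled.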

Decision framework: when to add each layer

A common Django mistake is to provision Postgres + Redis + Celery on day one for a portal that has 12 articles and 3 users. The right order:

Stage	Scale	Stack	Why
Prototype	< 1,000 articles	SQLite + sync fetch_news via cron	Zero ops; ship in a day
Early production	1k – 50k articles	Postgres + sync fetch via cron	FTS justifies Postgres; no Redis yet
Real traffic	> 50k articles or > 1k DAU	+ Celery beat + Redis cache	Async fetch protects request latency; cache absorbs reads
Multi-source	> 5 fetchers	+ Celery worker pool, separate beat container	Isolate fetcher failures from web tier

Two rules behind the table. Don't add Celery until your synchronous fetch starts blocking web requests — until then, a cron job calling a management command is simpler and equally effective. Don't add Redis caching until you can measure repeat reads on the same article in your logs; for low-traffic sites the cache costs more in operational complexity than it saves in database load.
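The table can be read as a small decision function. This sketch copies its thresholds literally; recommend_stage is an illustrative name, and the cut-offs are rough guides rather than hard limits.

```python
def recommend_stage(articles, daily_active_users=0, fetchers=1):
    # Checks run from the heaviest stage down, so the first match wins.
    if fetchers > 5:
        return "Multi-source"
    if articles > 50_000 or daily_active_users > 1_000:
        return "Real traffic"
    if articles >= 1_000:
        return "Early production"
    return "Prototype"


print(recommend_stage(12, daily_active_users=3))         # Prototype
print(recommend_stage(20_000))                           # Early production
print(recommend_stage(5_000, daily_active_users=2_500))  # Real traffic
print(recommend_stage(80_000, fetchers=8))               # Multi-source
```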

Frequently Asked Questions

How do you build a news website with Django?

To build a news website with Django, define an Article model with fields for title, body, source, published date, categories, and metadata. Schedule a Celery beat task that polls a news API every 15 minutes and upserts articles by external ID. Render lists with ListView, cache hot articles in Redis via Django's cache framework, add HTMX-driven infinite scroll for the feed, and enable Postgres full-text search across titles and bodies.

Can Django pull data from a REST API?

Yes — Django can pull data from any REST API using httpx or requests inside a management command or Celery task. The standard pattern is a Celery beat schedule that fires the fetcher every N minutes, uses Model.objects.update_or_create keyed on the upstream API's stable identifier to keep upserts idempotent, and wraps each insert in a transaction. Retry transient failures with @shared_task(bind=True, max_retries=3).

How do you schedule periodic tasks in Django?

Schedule periodic Django tasks with Celery beat, the scheduler bundled with Celery. Add a CELERY_BEAT_SCHEDULE dict to settings with the task path and schedule (in seconds, or a crontab instance), then run celery -A project beat alongside your worker process. For very simple cases without Celery, a system cron job calling python manage.py custom_command is also valid.

What is the best way to cache news content in Django?

The best way to cache news content in Django is the framework's cache layer backed by Redis, with a per-article TTL that lengthens as the article ages: 1 minute for breaking stories under an hour old, 5 minutes for recent articles up to 24 hours old, and 1 hour for archives. Invalidate via a post_save signal so an upserted article evicts its cache entry immediately.

Kent Hudson
