Automated QC Just Before Blog Publishing — The Secret to 0 Publishing Errors in 6 Months


When you manage a blog with over 200 posts, human review inevitably misses things. Markdown remnants (such as **bold** showing up with its asterisks intact), emoji whitelist violations, missing sources, empty tables, and leftover box styles are common culprits. That is why we added a separate step that automatically checks and fixes posts right before they are sent to the blog API.

This post explains the intent behind building this automated QC system, how it works, the actual results we achieved, and how we validated it. We have distilled the core concepts so that any blog operator facing similar issues can implement it with just a single page of code.

Why We Built It

During the first year, we frequently encountered two types of issues.

First, model output remnants. When we generated body text with an LLM, markdown tokens like **bold**, ## Subheading, or --- often remained unconverted to HTML, so the asterisks were visible on the live site.

Second, cases where the post looked fine right after writing, but some hook broke it just before publishing. For example, a function might open an extra container tag in the body without closing it, breaking the card/sidebar layout, or an automatic price-table insertion might end up as an empty table.

How It Works

The checkpoint consists of two stages.

Stage 1: Sanitize — Unconditional fixes
It takes the HTML and applies the following across the board:


Remove dangerous inline styles (e.g., width:800px, margin-left:-30px, position:absolute)
Remove fixed width/height attributes from <img> tags -> Preserve responsiveness
Convert markdown remnants to HTML (**X** -> <strong>X</strong>, arbitrary --- -> <hr>)
Strip characters violating the emoji policy (ranges U+2600-27BF, U+1F000-1FAFF)
Flatten box styles (any element with border, box-shadow, or padding>20px)
Inject a line of safe CSS into the body container (max-width:100%, overflow-wrap:anywhere)


This stage is a mechanical process that requires no human judgment. It is designed to produce consistent results for any post.
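Three of the fixes in the list above can be sketched with plain regular expressions. This is a minimal illustration under stated assumptions, not the checkpoint's actual code: the function names are hypothetical, the size attributes are assumed to sit on <img> tags, and a markdown rule is assumed to be a line made up only of three or more dashes.

```python
import re

# Assumption: fixed width/height attributes are only stripped from <img> tags.
_IMG_TAG = re.compile(r'<img\b[^>]*>', re.IGNORECASE)
_SIZE_ATTR = re.compile(
    r'\s+(?:width|height)\s*=\s*(?:"[^"]*"|\'[^\']*\'|[^\s>]+)',
    re.IGNORECASE,
)
# Assumption: a leftover markdown rule is a line holding only 3+ dashes.
_MD_RULE = re.compile(r'^[ \t]*-{3,}[ \t]*$', re.MULTILINE)
# The two emoji ranges quoted in the list: U+2600-27BF and U+1F000-1FAFF.
_EMOJI = re.compile('[\u2600-\u27BF\U0001F000-\U0001FAFF]')


def strip_img_size(html: str) -> str:
    """Drop width/height attributes inside <img> tags, leaving other tags alone."""
    return _IMG_TAG.sub(lambda m: _SIZE_ATTR.sub('', m.group(0)), html)


def md_rule_to_hr(html: str) -> str:
    """Turn a stray markdown rule line into an <hr> element."""
    return _MD_RULE.sub('<hr>', html)


def strip_emoji(html: str) -> str:
    """Remove every character that falls in the two emoji ranges."""
    return _EMOJI.sub('', html)
```

Matching the whole <img> tag first and then cleaning attributes inside it keeps the attribute pattern from touching width values on other elements, such as table cells.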

Stage 2: Quality Gate — Block publishing on failure
It automatically checks for omissions that a human would have noticed. If a post fails, publishing is rejected.


Body text length under 600 characters -> fail
Fewer than three <h2> tags -> fail (for guide/comparison posts)
0 images -> fail (regardless of post type)
Comparison posts without a 
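The empty tables mentioned earlier in this post lend themselves to a gate check as well. The following is a minimal sketch rather than the checkpoint's actual code: count_empty_tables and the _EmptyTableFinder helper are hypothetical names, and "empty" is assumed to mean a table with no visible text and no image inside it.

```python
from html.parser import HTMLParser


class _EmptyTableFinder(HTMLParser):
    """Counts <table> elements that hold no visible text and no <img>."""

    def __init__(self) -> None:
        super().__init__()
        self._stack = []        # one "has content" flag per currently open table
        self.empty_tables = 0

    def handle_starttag(self, tag, attrs):
        if tag == 'table':
            self._stack.append(False)
        elif tag == 'img' and self._stack:
            self._stack[-1] = True

    def handle_endtag(self, tag):
        if tag == 'table' and self._stack:
            has_content = self._stack.pop()
            if not has_content:
                self.empty_tables += 1
            elif self._stack:
                # Content in a nested table also counts for the outer table.
                self._stack[-1] = True

    def handle_data(self, data):
        if self._stack and data.strip():
            self._stack[-1] = True


def count_empty_tables(html: str) -> int:
    """Return how many tables in the HTML have no text and no image."""
    finder = _EmptyTableFinder()
    finder.feed(html)
    finder.close()
    return finder.empty_tables
```

A gate could append a failure reason whenever the count is above zero. Note that a table with header cells but no data rows still counts as non-empty under this definition.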

Actual Results

Results over the 6 months since adoption:


Exposed markdown remnants: Average of 4 cases/month before -> 0 cases after
Layout clipping (left/right): Average of 7 cases/month before -> 0 cases after
Empty tables / empty charts in body: Average of 3 cases/month before -> 0 cases after
Blocked posts: 38 posts in total (all corrected by authors and successfully republished)


The 38 blocked posts were not lost. The authors simply became aware of the issues, refined the content, and retried, leading to successful publishing. The distribution of blocking reasons was: missing sources (41%), insufficient character count (26%), 0 images (21%), and others (12%).

Validation Methods

Here is how we validated the checkpoint after building it:

Golden Set Regression Testing — We collected the original drafts of 41 posts that had issues in the past to create a "golden set." We automatically verified whether the issue patterns disappeared when running them through the sanitize + quality gate process. Initially, 39/41 passed. After analyzing the 2 failures and reinforcing our regular expressions, we achieved a 41/41 pass rate.
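A golden set check of this kind can be sketched as a loop that pairs each draft with the defect pattern it used to show. Everything below is illustrative: the two sample drafts are invented for the example (they are not the 41 posts), and sanitize is a small stand-in for whatever sanitizer you actually run.

```python
import re

# Hypothetical golden set: (draft HTML, regex for the defect that must be gone).
GOLDEN_SET = [
    ('<p>**Pricing** starts at $9</p>', r'\*\*'),
    ('<p>Intro</p>\n---\n<p>Body</p>', r'^-{3,}$'),
]


def sanitize(html: str) -> str:
    """Stand-in sanitizer (an assumption): fixes bold markers and rule lines."""
    html = re.sub(r'\*\*(.+?)\*\*', r'<strong>\1</strong>', html)
    html = re.sub(r'^-{3,}$', '<hr>', html, flags=re.MULTILINE)
    return html


def run_golden_set(cases, sanitizer):
    """Return the indices of cases whose defect pattern survives sanitizing."""
    failures = []
    for index, (draft, defect) in enumerate(cases):
        if re.search(defect, sanitizer(draft), flags=re.MULTILINE):
            failures.append(index)
    return failures
```

An empty return value means every known defect is gone. Returning indices instead of a bare pass/fail makes it easy to see which drafts need a stronger pattern.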

Live Spot-Checks — In the first week of applying the new sanitizer, we randomly selected 8 out of 18 published posts and fetched their live pages. We checked if horizontal scrolling occurred, if text overflowed the container, or if images broke at two widths: desktop (1280px) and mobile (360px). 8/8 were normal.

Double-Pass Idempotency — We verified whether running the sanitizer a second time on an already sanitized output produced the exact same result. This validation ensures safety in case the publish hook chain runs twice. 100/100 were identical.
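The double-pass check reduces to comparing one pass against two. A minimal sketch, assuming a stand-in sanitize function and a hypothetical helper name:

```python
import re


def sanitize(html: str) -> str:
    """Stand-in sanitizer (an assumption), not the full checkpoint."""
    html = re.sub(r'\*\*(.+?)\*\*', r'<strong>\1</strong>', html)
    html = re.sub(r'[\U0001F300-\U0001FAFF]', '', html)
    return html


def non_idempotent(samples, sanitizer):
    """Return the samples for which a second pass changes the first pass's output."""
    changed = []
    for sample in samples:
        once = sanitizer(sample)
        if sanitizer(once) != once:
            changed.append(sample)
    return changed


def wrap_body(html: str) -> str:
    """Counterexample: adds a new wrapper on every call, so it is not idempotent."""
    return '<div style="max-width:100%">' + html + '</div>'
```

The wrap_body counterexample shows why the check matters for steps that inject markup or CSS: without a guard against re-wrapping, a hook chain that runs twice would nest the container twice.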

How to Build It Yourself

Rather than copying the entire code, you can adapt just one or two core elements to fit your environment.


import re

# Inline width of 400px or more. The lookbehind keeps max-width, min-width,
# and border-width declarations intact instead of cutting them in half.
WIDE_WIDTH = re.compile(
    r'(?<![\w-])width\s*:\s*(?:[4-9]\d{2}|[1-9]\d{3,})px\s*;?', re.IGNORECASE)


def sanitize_pre_publish(html: str) -> tuple[str, list[str]]:
    """Apply the unconditional fixes; return (html, names of the fixes applied)."""
    fixes = []
    # Remove dangerous inline width
    html, n = WIDE_WIDTH.subn('', html)
    if n:
        fixes.append('strip_wide_width')
    # Markdown remnants -> HTML
    html, n = re.subn(r'\*\*(.+?)\*\*', r'<strong>\1</strong>', html)
    if n:
        fixes.append('md_bold')
    # Strip emojis (if necessary)
    html, n = re.subn(r'[\U0001F300-\U0001FAFF]', '', html)
    if n:
        fixes.append('strip_emoji')
    return html, fixes


def quality_gate(html: str, post_type: str) -> tuple[bool, list[str]]:
    """Return (passed, failure reasons); an empty list means the post may publish."""
    fails = []
    # Measure body length on visible text, ignoring tags and all whitespace
    text = re.sub(r'<[^>]+>', '', html)
    if len(re.sub(r'\s+', '', text)) < 600:
        fails.append('too_short')
    if post_type in ('howto', 'compare') and html.count('<h2') < 3:
        fails.append('few_h2')
    if '<img' not in html:
        fails.append('no_image')
    if 'TODO' in html or 'REDACTED' in html:
        fails.append('placeholder')
    return not fails, fails


You only need to call these two functions at a single point right before publishing: run sanitize_pre_publish first, then pass its output HTML to quality_gate. If quality_gate returns a failure, block the publishing process and return the reasons to the user. Otherwise, pass the sanitized HTML directly to the publishing API.
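The single call point can be sketched as one small function. This is an illustration under assumptions: pre_publish and its result dictionary are hypothetical, and the sanitizer, gate, and publish step are passed in as callables so the sketch does not depend on any particular blog API.

```python
from typing import Callable


def pre_publish(
    html: str,
    post_type: str,
    sanitize: Callable[[str], tuple[str, list[str]]],
    gate: Callable[[str, str], tuple[bool, list[str]]],
    publish: Callable[[str], object],
) -> dict:
    """Sanitize, then gate; call publish only when the gate passes."""
    clean, fixes = sanitize(html)
    ok, fails = gate(clean, post_type)
    if not ok:
        # Blocked: hand the reasons back so the author can fix and retry.
        return {'published': False, 'fixes': fixes, 'fails': fails}
    publish(clean)
    return {'published': True, 'fixes': fixes, 'fails': []}
```

Running the gate on the sanitized HTML rather than the raw draft means a post is never blocked for something the sanitizer would have fixed anyway.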

In short, it boils down to one line: "Catch mechanical errors automatically at a single checkpoint before publishing." Reviewers no longer spend time hunting for these errors, and authors only step in when the gate blocks a post.

ToolSignal Pro Editorial Team
Independent SaaS reviews · curated for small business
We test the tools we recommend, document our methodology, and never accept payment for placement. Comparisons are based on hands-on trials, pricing data refreshed quarterly, and feedback from small-business operators in the field.
