-
Bleach is a whitelist-based HTML sanitizing library that escapes or strips disallowed tags and attributes. It can also linkify URLs in text, applying filters that Django's urlize filter does not provide.
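
To make the whitelist (allowlist) idea concrete, here is a minimal sketch that uses only the Python standard library. It is not Bleach's implementation or API. The names `ALLOWED_TAGS`, `ALLOWED_ATTRS`, `AllowlistSanitizer`, and `sanitize` are hypothetical, chosen for illustration. The sketch also omits checks a real sanitizer needs, such as URL scheme filtering on `href`, so it should not be used on untrusted input as-is.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist for illustration; these are not Bleach's defaults.
ALLOWED_TAGS = {"a", "b", "em", "strong", "p"}
ALLOWED_ATTRS = {"a": {"href", "title"}}


class AllowlistSanitizer(HTMLParser):
    """Rebuild markup, keeping only allowlisted tags and attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            return  # drop the tag itself; its text content still passes through
        allowed = ALLOWED_ATTRS.get(tag, set())
        kept = "".join(
            f' {name}="{escape(value or "", quote=True)}"'
            for name, value in attrs
            if name in allowed
        )
        self.out.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Re-escape text so it cannot be read back as markup.
        self.out.append(escape(data, quote=False))


def sanitize(html_text):
    """Return html_text with non-allowlisted tags and attributes stripped."""
    parser = AllowlistSanitizer()
    parser.feed(html_text)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.out)


if __name__ == "__main__":
    print(sanitize('<a href="http://example.com" onclick="evil()">link</a>'))
    print(sanitize('<p>Hi <span class="x">there</span></p>'))
```

The sketch strips disallowed tags while keeping their text, which is one of the two behaviours described above; escaping the disallowed markup instead would be the other.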
-
Python and PHP implementations of an HTML parser based on the WHATWG HTML5 specification, for maximum compatibility with major desktop web browsers.
Note that the separate ports are not kept in sync; they are effectively different projects offering similar functionality for their respective languages.