Package: python3-html-text (0.7.0-1.1)

Links for python3-html-text

Maintainer:

Christian Marillat (QA Page)

External Resources:

Homepage [github.com]

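Experimental package

Warning: This package is from the experimental distribution. That means it is likely to be unstable or buggy, and it may even cause data loss. Please be sure to consult the changelog and other available documentation before using it.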

extract text from HTML.

How is html_text different from .xpath('//text()') from lxml or .get_text() from Beautiful Soup?

 * Text extracted with html_text does not contain inline styles,
   JavaScript, comments and other text that is not normally visible to
   users;
 * html_text normalizes whitespace, but in a smarter way than
   .xpath('normalize-space()'), adding spaces around inline elements (which
   are often used as block elements in HTML markup), and trying to avoid
   adding extra spaces for punctuation;
 * html_text can add newlines (e.g. after headers or paragraphs), so that
   the output text looks more like how it is rendered in browsers.
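
The three points above can be illustrated with a minimal sketch that uses only the Python standard library. This is not how html_text is implemented. The names `VisibleTextParser` and `sketch_extract_text`, and the two tag sets, are hypothetical and chosen for this illustration only. The sketch also leaves out the spacing heuristics for inline elements and punctuation.

```python
import re
from html.parser import HTMLParser

# Hypothetical tag sets for this illustration; html_text uses its own rules.
HIDDEN_TAGS = {"script", "style"}
BLOCK_TAGS = {"p", "div", "h1", "h2", "h3", "h4", "h5", "h6", "li", "br"}


class VisibleTextParser(HTMLParser):
    """Collect text a user would normally see, with line breaks at blocks."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self.hidden_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in HIDDEN_TAGS:
            self.hidden_depth += 1
        elif tag in BLOCK_TAGS:
            self.chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in HIDDEN_TAGS:
            self.hidden_depth = max(0, self.hidden_depth - 1)
        elif tag in BLOCK_TAGS:
            self.chunks.append("\n")

    def handle_data(self, data):
        # Skip script/style content; collapse whitespace runs in visible text.
        if self.hidden_depth == 0:
            self.chunks.append(re.sub(r"\s+", " ", data))

    # Comments are dropped: the default handle_comment() does nothing.


def sketch_extract_text(html):
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    lines = "".join(parser.chunks).split("\n")
    cleaned = (re.sub(r" +", " ", line).strip() for line in lines)
    return "\n".join(line for line in cleaned if line)


sample = (
    "<h1>Title</h1><style>p {color: red}</style>"
    "<p>Hello   <b>world</b>!<!-- hidden --></p>"
    "<script>var x = 1;</script>"
)
print(sketch_extract_text(sample))
# Title
# Hello world!
```

In the package itself, the entry point documented in the upstream README is `html_text.extract_text(html)`. Its output for the same input may differ from this sketch, for example in how many newlines follow a header.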

Other Packages Related to python3-html-text

depends

recommends

suggests

enhances

dep: python3

interactive high-level object-oriented language (default python3 version)

dep: python3-lxml

pythonic binding for the libxml2 and libxslt libraries

dep: python3-lxml-html-clean

blocklist-based HTML cleaner

Download python3-html-text

Download for all available architectures
Architecture	Package Size	Installed Size	Files
all	10.0 kB	40.0 kB	[list of files]