Linux 软件免费装

Botkibble

开发者 gregrandall
更新时间 2026年2月27日 05:19
PHP版本: 8.2 及以上
WordPress版本: 6.9
版权: GPL-2.0-only
版权网址: 版权信息

标签

markdown api ai crawlers agents

下载

1.2.1 1.3.0

详情介绍:

AI agents, LLMs, and crawlers have to wade through navigation bars, sidebars, ads, and comment forms to reach the content they want, and every element costs tokens. Cloudflare measured an 80% reduction in token usage when converting a blog post from HTML to Markdown (16,180 tokens down to 3,150). Botkibble adds a Markdown endpoint to every published post and page. Cloudflare offers Markdown for Agents at the CDN edge on Pro, Business, and Enterprise plans. Botkibble does the same thing (for free) at the origin, so it works on any host. GitHub Repository Three ways to request Markdown: What's in every response: Performance: Botkibble writes Markdown to disk on the first request, then serves it as a static file. A built-in Fast-Path serves cached files during WordPress's init hook, before the main database query runs. No extra configuration needed. Add a web server rewrite rule and Botkibble bypasses PHP entirely, serving .md files the same way a server would serve an image or CSS file: | Method | Avg. response time | |---|---| | Standard HTML | 0.97s | | Markdown (cold, first request) | 0.95s | | Markdown (cached, PHP Fast-Path) | 0.87s | | Markdown (Nginx/Apache direct) | 0.11s | Serving directly from disk is 88% faster than a full WordPress page load. See the Performance section below for Nginx and Apache configuration. Security: Cache variants (optional): You can persist alternate cached representations by adding ?botkibble_variant=slim (or any other variant name). Variant caches are stored under: /wp-content/uploads/botkibble/_v//.md What it does NOT do:

安装:

  1. Upload the botkibble directory to wp-content/plugins/.
  2. Run composer install inside the plugin directory to install dependencies.
  3. Activate the plugin through the Plugins menu in WordPress.
  4. That's it. No settings page needed.
Test it: curl https://gregr.org/great-hvac-meltdown/?format=markdown curl https://gregr.org/great-hvac-meltdown.md curl -H "Accept: text/markdown" https://gregr.org/great-hvac-meltdown/

常见问题:

How do I add support for custom post types?

The plugin only serves posts and pages by default. To add a custom post type, use the botkibble_served_post_types filter in your theme or a custom plugin: add_filter( 'botkibble_served_post_types', function ( $types ) { $types[] = 'docs'; return $types; } ); Be careful. Only add post types that contain public content. Do not expose post types that may contain private or sensitive data (e.g. WooCommerce orders).

What does the YAML frontmatter include?

Every response starts with a YAML block containing:

  • title — the post title
  • date — publish date in ISO 8601 format
  • type — post type (e.g. post, page)
  • word_count — word count of the Markdown body
  • char_count — character count of the Markdown body
  • tokens — estimated token count (word_count × 1.3)
  • categories — array of category names (posts only)
  • tags — array of tag names (posts only, omitted if none)
Example: title: My Post Title date: '2025-06-01T12:00:00+00:00' type: post word_count: 842 char_count: 4981 tokens: 1095 categories: - Technology tags: - wordpress - markdown

How do I add custom fields to the frontmatter?

Use the botkibble_frontmatter filter: add_filter( 'botkibble_frontmatter', function ( $data, $post ) { $data['excerpt'] = get_the_excerpt( $post ); return $data; }, 10, 2 );

How do I change the Content-Signal header?

Use the botkibble_content_signal filter: add_filter( 'botkibble_content_signal', function ( $signal, $post ) { return 'ai-train=no, search=yes, ai-input=yes'; }, 10, 2 ); Return an empty string to omit the header entirely.

Can I change the token count estimation?

Yes, use the botkibble_token_multiplier filter. The default multiplier of 1.3 (word count × 1.3) comes from Cloudflare's implementation: add_filter( 'botkibble_token_multiplier', function () { return 1.5; // Adjusted for a different model's tokenizer } );

How do I adjust the rate limit?

Cache misses (when a post needs to be converted) are limited to 20 per minute globally. You can change this with the botkibble_regen_rate_limit filter: add_filter( 'botkibble_regen_rate_limit', function () { return 50; } );

Can I add custom HTML cleanup rules?

Yes, use the botkibble_clean_html filter. This runs after the default cleanup and before conversion: add_filter( 'botkibble_clean_html', function ( $html ) { // Remove a plugin's wrapper divs $html = preg_replace( '/ (.*?)<\/div>/s', '$1', $html ); return $html; } );

Can I strip script nodes during conversion?

Yes. Botkibble keeps converter node removal disabled by default (for backward compatibility), but you can opt in with botkibble_converter_remove_nodes: add_filter( 'botkibble_converter_remove_nodes', function ( $nodes ) { $nodes = is_array( $nodes ) ? $nodes : []; $nodes[] = 'script'; return $nodes; } ); If you also need application/ld+json, extract it in botkibble_clean_html first, then let converter-level script removal clean up any remaining script tags.

How do I modify the body before metrics are calculated?

Use the botkibble_body filter. This is the best place to add content like ld+json that you want included in the word count and token estimation: add_filter( 'botkibble_body', function ( $body, $post ) { $json_ld = '...'; return $body . "\n\n" . $json_ld; }, 10, 2 );

How do I modify the final Markdown output?

Use the botkibble_output filter to append or modify the text after conversion: add_filter( 'botkibble_output', function ( $markdown, $post ) { return $markdown . "\n\n---\nServed by Botkibble"; }, 10, 2 );

Can I cache multiple Markdown variants (e.g. a slim version)?

Yes. Add ?botkibble_variant=slim when requesting Markdown to generate and serve a separate cached file. To ensure your variants are invalidated on post updates, return them from the botkibble_cache_variants filter: add_filter( 'botkibble_cache_variants', function ( $variants, $post ) { $variants[] = 'slim'; return $variants; }, 10, 2 );

Can I disable the Accept header detection?

Yes, if you only want to serve Markdown via explicit URLs (.md or ?format=markdown), use the botkibble_enable_accept_header filter: add_filter( 'botkibble_enable_accept_header', '__return_false' );

Does the .md suffix work with all permalink structures?

It works with the most common structures (post name, page hierarchy). Complex date-based permalink structures may require the query parameter or Accept header method instead.

What about password-protected posts?

They return a 403 Forbidden response. There's no point serving a password form to a bot.

What are the response headers?

  • Content-Type: text/markdown; charset=utf-8
  • Vary: Accept — tells caches that responses vary by Accept header
  • X-Markdown-Tokens: <count> — estimated token count (word_count × 1.3)
  • X-Robots-Tag: noindex — prevents search engines from indexing the Markdown version
  • Link: <url>; rel="canonical" — points search engines to the original HTML post
  • Link: <url>; rel="alternate" — advertises the Markdown version for discovery
  • Content-Signal: ai-train=no, search=yes, ai-input=yes — see contentsignals.org

更新日志:

1.3.0 1.2.1 1.2.0 1.1.2 1.1.0 1.0.0