XML Sitemap Generator: Community Edition Help

Settings Explained

Search Engine Updates

  • Add sitemap URL to the virtual robots.txt file: When this option is active, the plugin adds your sitemap URL to the virtual robots.txt file. This allows search engines that don’t support the ping notification feature (such as Baidu or Yandex) to discover your sitemap. WordPress generates the virtual robots.txt file, so make sure that there is no physical robots.txt file saved in your blog directory!
  • The IndexNow Protocol has become the default method of notifying Microsoft Bing, Seznam.cz, Naver, and Yandex search engines in real-time about changes to your website. All editions of our plugin also support this protocol. Review the FAQs about IndexNow.
  • The Sitemap Ping Protocol has been deprecated by Microsoft and Google. The LastMod attribute is the manner by which search engines become aware of updated content on your website. Add your site to Google Search Console and Microsoft Bing Webmaster Tools, to ensure that search engines can provide crawling diagnostics and feedback for your site – a robots.txt hint is not enough.
  • Notify Google about updates of your Blog: This feature automatically notifies Google every time you publish a new post or edit an existing one. After being notified, Google will fetch your sitemap and index your new post soon.
  • Notify Bing about updates of your Blog: This feature automatically notifies Bing every time you publish a new post or edit an existing one. After being notified, Bing will fetch your sitemap and index your new post soon. As Bing powers Yahoo Search, your posts should also appear on Yahoo soon.
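
As a rough illustration of the robots.txt option, the sketch below appends a Sitemap line to robots.txt content. The function name and the example URL are hypothetical, and this is not the plugin’s own code. Inside WordPress, a callback like this could be attached to the core robots_txt filter, which receives the generated robots.txt text.

```php
<?php
/**
 * Hypothetical helper: append a "Sitemap:" line to robots.txt content.
 * Not the plugin's own code; the sitemap URL is supplied by the caller.
 */
function sketch_add_sitemap_to_robots( string $robots_txt, string $sitemap_url ): string {
    $line = 'Sitemap: ' . $sitemap_url;

    // Leave the content alone if the line is already present.
    if ( false !== stripos( $robots_txt, $line ) ) {
        return $robots_txt;
    }

    return rtrim( $robots_txt ) . "\n\n" . $line . "\n";
}

// Assumption: this hook-up only runs inside WordPress, where add_filter() exists.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'robots_txt', function ( $output ) {
        return sketch_add_sitemap_to_robots( (string) $output, 'https://www.example.com/sitemap.xml' );
    } );
}
```

The check for an existing line keeps the output stable if the callback runs more than once.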

Advanced Options

  • Try to increase the memory limit: This feature allows you to increase the memory limit in case you ever encounter an out-of-memory error while requesting your sitemap. This option is generally unnecessary but can come in handy in certain circumstances.
  • Try to increase the execution time limit: Similar to the memory limit, this option allows you to set the maximum execution time limit for generating the sitemap.
  • Include a XSLT stylesheet: The XML sitemap generated by the plugin can be hard to read by humans. To address this, the plugin has a default stylesheet that makes it more readable. You can also enter the full URL to your stylesheet, but make sure it’s on the same domain.
  • Override the base URL of the sitemap: If you have installed WordPress in a subdirectory and want your sitemap to appear in the root domain, you can use this option. Please refer to the help page for more information.
  • Include sitemap in HTML format: Activating this option generates a sitemap in HTML format, which can be helpful for bots that don’t understand the XML standard.
  • Override the file name of the sitemap: By default, this plugin generates sitemaps with file names based on your website’s domain and the type of sitemap (e.g., sitemap.xml, sitemap-pages.xml, etc.). However, if you want to customize the file name for any reason, you can use this setting to do so.
  • Allow anonymous statistics: This feature sends anonymous statistics to the plugin author, such as plugin version, WordPress version, PHP version, language, and the number of posts (in steps of 50). This data helps the author optimize the plugin for popular WordPress/PHP versions and improve translations for common languages. Please note that no personal information such as blog URL, title, name, or email address is ever sent, and there is no way to find out who is using the plugin for what.
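
The memory and execution time options can be pictured with standard PHP calls. The following is a minimal sketch, assuming the limits are raised with ini_set() and set_time_limit(); the helper names and the values passed in are hypothetical, and hosts may ignore or forbid such changes.

```php
<?php
/**
 * Convert a PHP shorthand size such as "128M" into bytes.
 * Hypothetical helper for this sketch.
 */
function sketch_shorthand_to_bytes( string $value ): int {
    $value = trim( $value );
    if ( '' === $value ) {
        return 0;
    }

    $number = (int) $value;
    switch ( strtolower( substr( $value, -1 ) ) ) {
        case 'g':
            return $number * 1024 ** 3;
        case 'm':
            return $number * 1024 ** 2;
        case 'k':
            return $number * 1024;
    }

    return $number;
}

/**
 * Try to raise the memory and execution time limits for the current request.
 */
function sketch_try_raise_limits( string $memory, int $seconds ): void {
    $current = sketch_shorthand_to_bytes( (string) ini_get( 'memory_limit' ) );

    // A value of -1 means "no limit", so only a lower, finite limit is raised.
    if ( $current > 0 && $current < sketch_shorthand_to_bytes( $memory ) ) {
        @ini_set( 'memory_limit', $memory );
    }

    if ( function_exists( 'set_time_limit' ) ) {
        @set_time_limit( $seconds );
    }
}
```

A call such as sketch_try_raise_limits( '256M', 120 ) would be made before the sitemap is built; both calls fail silently when the server does not allow them.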

Additional Pages

  • This feature allows you to add files or URLs that are not part of your WordPress blog to your sitemap. For example, if your blog is located at www.example.com/blog, but you want to include your homepage at www.example.com, you can add it here.

Note: If you want to add pages not located in or beneath the blog directory and your blog is in a subdirectory, you must place your sitemap file in the root directory.

Post Priority

  • Do not use automatic priority calculation: This option gives all posts the same priority in your sitemap. You can set that priority manually under the “Priorities” tab.
  • Comment Count: This option calculates the priority of each post based on the number of comments it has received.
  • Comment Average: Similar to Comment Count, this option also calculates the priority based on the number of comments each post has received. However, it uses the average number of comments per post as the calculation base.
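
The difference between the two comment-based modes can be sketched as follows. The plugin’s real formulas are not documented on this page, so both functions are hypothetical: the first scales a post by its share of all comments, the second compares it with the average post.

```php
<?php
/**
 * "Comment Count" style (assumed formula): the post's share of all comments.
 */
function sketch_priority_by_count( int $post_comments, int $total_comments ): float {
    if ( $total_comments <= 0 ) {
        return 0.0;
    }

    return round( min( 1.0, $post_comments / $total_comments ), 1 );
}

/**
 * "Comment Average" style (assumed formula): an average post gets 0.5,
 * a post with twice the average or more gets 1.0.
 */
function sketch_priority_by_average( int $post_comments, float $average_comments ): float {
    if ( $average_comments <= 0 ) {
        return 0.0;
    }

    return round( min( 1.0, 0.5 * $post_comments / $average_comments ), 1 );
}
```

Under these assumptions, the count-based mode favors a few heavily discussed posts, while the average-based mode spreads priorities more evenly around the middle of the range.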

Sitemap Content

  • WordPress standard content: Check the items you want to include in your sitemap.
  • Custom taxonomies: Check all the custom taxonomies you would like to include.
  • Custom post types: Check all the custom post types you want to include.
  • Include the last modification time: This adds the last modification date to every entry in the sitemap. Search engines can use this information to revisit a page when it has changed. It is strongly recommended to keep this option activated.
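
The sitemap protocol expects the last modification date in W3C datetime format. The sketch below shows one way to produce a lastmod element, assuming the modification time is available as a UTC string in "Y-m-d H:i:s" form; the function name is hypothetical.

```php
<?php
/**
 * Build a <lastmod> element from a UTC "Y-m-d H:i:s" string.
 * Returns an empty string when the input cannot be parsed.
 */
function sketch_lastmod_tag( string $utc_datetime ): string {
    $date = DateTimeImmutable::createFromFormat(
        'Y-m-d H:i:s',
        $utc_datetime,
        new DateTimeZone( 'UTC' )
    );

    if ( false === $date ) {
        return '';
    }

    // DATE_W3C is PHP's built-in W3C datetime format.
    return '<lastmod>' . $date->format( DATE_W3C ) . '</lastmod>';
}

echo sketch_lastmod_tag( '2024-05-31 13:58:13' ), "\n";
// prints <lastmod>2024-05-31T13:58:13+00:00</lastmod>
```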

Exclude Items

  • Excluded categories: This option allows you to exclude a category from appearing in your sitemap. If you have specific categories that you don’t want to include in your sitemap, you can select them here.
  • Exclude posts: Use this option to exclude specific posts from your sitemap. To exclude a post, enter the ID of the post, which can be found under the “Edit Post” screen in WordPress. If you want to exclude multiple posts, simply separate their IDs with a comma.
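
A comma-separated ID list like the one described above can be normalized as in the following sketch. The function name is hypothetical and the plugin’s own parsing may differ; the example only shows why stray spaces, empty entries, and non-numeric values are harmless when handled this way.

```php
<?php
/**
 * Turn a comma-separated string of post IDs into a clean list of integers.
 */
function sketch_parse_excluded_ids( string $input ): array {
    $ids = array_map( 'intval', explode( ',', $input ) );

    // Drop empty and non-numeric entries, which intval() turns into 0.
    $ids = array_filter( $ids, function ( int $id ): bool {
        return $id > 0;
    } );

    return array_values( array_unique( $ids ) );
}

print_r( sketch_parse_excluded_ids( '13, 42,,abc, 13' ) );
// the resulting list contains 13 and 42
```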

Change Frequencies / Priorities

  • Change frequencies: This setting allows you to indicate to search engines how often the content on your blog changes. Remember that it is ultimately up to the search engine to decide how often to revisit older content.
  • Priorities: This setting allows you to indicate to search engines the relative importance of your blog’s content. Note that the value you set for each entry is always in relation to all the other content on your blog, so setting everything to the highest priority (1.0) may not be useful.
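
In the sitemap protocol, the change frequency is one of a fixed set of keywords and the priority is a number between 0.0 and 1.0. The sketch below builds a single url entry under those rules; the function name is hypothetical and the output is not meant to mirror the plugin’s exact markup.

```php
<?php
/**
 * Build one <url> entry with a validated change frequency and priority.
 */
function sketch_url_entry( string $loc, string $changefreq, float $priority ): string {
    $allowed = [ 'always', 'hourly', 'daily', 'weekly', 'monthly', 'yearly', 'never' ];

    if ( ! in_array( $changefreq, $allowed, true ) ) {
        throw new InvalidArgumentException( 'Unknown change frequency: ' . $changefreq );
    }

    // Clamp the priority into the range the protocol allows.
    $priority = max( 0.0, min( 1.0, $priority ) );

    return sprintf(
        '<url><loc>%s</loc><changefreq>%s</changefreq><priority>%.1F</priority></url>',
        htmlspecialchars( $loc, ENT_XML1 | ENT_QUOTES, 'UTF-8' ),
        $changefreq,
        $priority
    );
}
```

Escaping the location matters because URLs with query strings contain ampersands, which are not valid as-is inside XML.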

Filters

Customize the behavior of the generator in your theme’s functions.php file or in a separate plugin. From version 4.1.20 onward, the following filters are available:

Customize the creation date sort order for URLs
By default, URLs are listed newest first in sitemaps. You may modify this behavior if desired.

/*
 * Return the string with "DESC" or "ASC" values for sorting in sitemap
 *
 * @return string The string with "DESC" or "ASC" values
 */
function sm_sitemap_sort_order() {
    return 'DESC';
}
add_filter( 'sm_sitemap_sort_order', 'sm_sitemap_sort_order', 10, 1 );

Override Robots.txt Disallow Statements
By default, the plugin omits any URLs that are disallowed in the robots.txt file. To include them anyway, specify their IDs (if they exist in WordPress) in the $remove_from_excludes_ids array of the following filter:

/*
Include URLs in sitemaps that were disallowed in the robots.txt file.
*/

add_filter( 'sm_robots_disallowed_ids', 'sm_robots_disallowed_ids', 10 );
function sm_robots_disallowed_ids( $rules ) {
    
    $remove_from_excludes_ids = [ '0' ];

    foreach ( $remove_from_excludes_ids as $id ) {
        if ( ( $key = array_search( $id, $rules ) ) !== false ) {
            unset( $rules[ $key ] );
        }
    }

    return $rules;
}

Customize the number of URLs per sitemap

/**
 * Alters the number of entries in each XML sitemap.
 *
 * @return integer The maximum entries per sitemap.
 */

function sm_sitemap_entries_per_page() {
    return 10;
}
add_filter( 'sm_sitemap_entries_per_page', 'sm_sitemap_entries_per_page' );

Filter the urlset element

/**
 * Filters the `urlset` for all sitemaps
 *
 * @param string $urlset The output for the sitemap's `urlset`.
 */

function sm_sitemap_urlset( $urlset ) { 
    return str_replace( '>', ' xmlns:xhtml="http://www.w3.org/1999/xhtml">', $urlset );
}
add_filter( 'sm_sitemap_urlset', 'sm_sitemap_urlset', 10, 1 );

Alter the URL of a sitemap entry

/**
 * Alters the URL structure for an example custom post type, "news"
 *
 * @param string  $url  The URL to modify.
 * @param WP_Post $post The post object.
 *
 * @return string The modified URL.
 */

function sm_xml_sitemap_post_url( $url, $post ) {
    if ( $post->post_type === 'news' ) {
        return \str_replace( 'news', 'news-replaced', $url );
    }

    return $url;
}
add_filter( 'sm_xml_sitemap_post_url', 'sm_xml_sitemap_post_url', 10, 2 );

Add additional/external XML sitemaps to the XML sitemap index

/**
 * Includes an additional/custom XML sitemap entry in the XML sitemap index.
 *
 * @param array $sitemap_custom_items Entries describing one or more custom sitemaps.
 *
 * @return array The sitemap index entries, including the additional ones.
 */

function sm_sitemap_index( $sitemap_custom_items ) {
    $sitemap_custom_items[] = [
        'title' => 'external-sitemap',
        'modified' => '2024-05-31 13:58:13'
    ];

    $sitemap_custom_items[] = [
        'title' => 'external-sitemap-2',
        'modified' => '2024-06-30 13:58:13'
    ];

    return $sitemap_custom_items;
}
add_filter( 'sm_sitemap_index', 'sm_sitemap_index' );

Add a custom post type

/**
 * Include a post type in the sitemaps
 *
 * @return array The slugs of post types to include.
 */

function sm_sitemap_include_post_type() {
    return [ 'news' ];
}
add_filter( 'sm_sitemap_include_post_type', 'sm_sitemap_include_post_type' );

Exclude specific posts from sitemaps

/**
 * Excludes posts from sitemaps
 *
 * @return array The IDs of posts to exclude.
 */

function exclude_posts_from_xml_sitemaps() {
    return [ 13 ];
}
add_filter( 'sm_exclude_from_sitemap_by_post_ids', 'exclude_posts_from_xml_sitemaps' );

Exclude a taxonomy (including tags) term

/**
 * Excludes terms with ID of 9 and 10 from terms sitemaps
 *
 * @param array $terms Array of term IDs already excluded.
 *
 * @return array The terms to exclude.
 */

function sm_exclude_from_sitemap_by_term_ids( $terms ) {
    return [ 9, 10 ];
}
add_filter( 'sm_exclude_from_sitemap_by_term_ids', 'sm_exclude_from_sitemap_by_term_ids' );

Exclude an author

/**
 * Excludes author with ID of 5 from author sitemaps
 *
 * @param array $users Array of User objects to filter through.
 *
 * @return array The remaining authors.
 */

function sm_sitemap_exclude_author( $users ) {
    return array_filter( $users, function( $user ) {
        // Cast before comparing: a strict comparison against a string never matches an integer ID.
        return (int) $user->ID !== 5;
    } );
}
add_filter( 'sm_sitemap_exclude_author', 'sm_sitemap_exclude_author' );

Exclude a taxonomy

/**
 * Exclude taxonomies from sitemaps
 *
 * @return array The slugs of taxonomies to exclude.
 */

function sm_sitemap_exclude_taxonomy() {
    return [ 'news-category', 'genre' ];
}
add_filter( 'sm_sitemap_exclude_taxonomy', 'sm_sitemap_exclude_taxonomy' );

Include empty taxonomies

/**
 * Include empty taxonomies in XML sitemaps
 *
 * @return bool The bool variable for taxonomy filter.
 */
function sm_sitemap_taxonomy_hide_empty() {
    return false;
}
add_filter( 'sm_sitemap_taxonomy_hide_empty', 'sm_sitemap_taxonomy_hide_empty' );

Exclude a post type

/**
 * Exclude a post type from sitemaps
 *
 * @return array The slugs of post types to exclude.
 */

function sm_sitemap_exclude_post_types() {
    return [ 'news' ];
}
add_filter( 'sm_sitemap_exclude_post_types', 'sm_sitemap_exclude_post_types' );

Other features

Move your sitemap to your domain root

If your WordPress site is installed in a subdirectory, such as example.com/blog/, your sitemap will be generated at example.com/blog/sitemap.xml by default. However, if you want to move your sitemap to example.com/sitemap.xml, you can do so by entering “http://example.com/blog/” in the “Override the base URL of the sitemap” field on the plugin settings page.

Once you’ve done that, you’ll also need to add a rewrite rule to your .htaccess file in the root directory of your domain. Here’s the code for the rewrite rule:

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteRule ^sitemap(-+([a-zA-Z0-9_-]+))?\.xml(\.gz)?$ /your-blogdir/sitemap$1.xml$3 [L]
</IfModule>

Make sure to replace “your-blogdir” with the actual name of the subdirectory where your WordPress site is installed. This rule internally rewrites requests for example.com/sitemap.xml to example.com/blog/sitemap.xml; the visitor’s URL does not change.
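
To see which paths the pattern matches and where they end up, the mapping can be replayed in PHP. This is only a sketch with a hypothetical function name. It relies on the pattern having three capture groups: the first holds the optional sitemap suffix (such as -misc), the second is nested inside it, and the third holds the optional .gz extension.

```php
<?php
/**
 * Map a requested root-level sitemap path to its location in the blog directory.
 * Returns null when the path is not a sitemap request.
 */
function sketch_rewrite_sitemap_path( string $path, string $blog_dir ): ?string {
    $pattern = '/^sitemap(-+([a-zA-Z0-9_-]+))?\.xml(\.gz)?$/';

    if ( ! preg_match( $pattern, $path, $matches ) ) {
        return null;
    }

    // Unmatched trailing groups are missing from $matches, so default them to ''.
    $suffix = $matches[1] ?? '';
    $gzip   = $matches[3] ?? '';

    return '/' . $blog_dir . '/sitemap' . $suffix . '.xml' . $gzip;
}

echo sketch_rewrite_sitemap_path( 'sitemap-misc.xml.gz', 'blog' ), "\n";
// prints /blog/sitemap-misc.xml.gz
```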

Common Problems

Google Webmaster Tools Shows 0 Indexed Pages

It is not uncommon to encounter an issue where Google Webmaster Tools displays some pages of your sitemap as “submitted” but not “indexed” or where the number of indexed pages is lower than the submitted ones. In such a scenario, please ensure that:

  • You have verified the correct website. Google distinguishes between HTTP and HTTPS, www and non-www, and root and subfolder. So, if your blog runs on http://www.blog.com/, make sure to add http://www.blog.com/ to Google Webmaster Tools and http://www.blog.com/sitemap.xml as your sitemap. If you add http://blog.com/ or https://www.blog.com/, you will NOT see indexed pages.
  • Your sitemap does not have any errors. Your sitemap might contain warnings, such as if your website was loading slowly when Google crawled it. This is not an issue with your sitemap.
  • If your sitemap contains links to unavailable pages, try to find them in WordPress and check which plugins you have installed. The plugin reads all posts from the post table that are published and do not have a password. If something appears in your sitemap, it is in your WordPress database.
  • Lastly, note that the statistics in Google Webmaster Tools are NOT real-time. They are for information purposes only. Use the “site:” operator in Google Search to determine which pages of your blog are currently indexed. It may take a few hours or even days for new URLs to appear as indexed in Google Webmaster Tools, but they are already included in the search results.
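
The first check in the list above (scheme, host, and folder must all match) can be expressed as a small comparison. The function name is hypothetical and the sketch only compares URLs as strings; it does not contact Google.

```php
<?php
/**
 * Check whether a sitemap URL belongs to the verified property URL:
 * same scheme, same host, and a path at or below the property's path.
 */
function sketch_sitemap_matches_property( string $property_url, string $sitemap_url ): bool {
    $property = parse_url( $property_url );
    $sitemap  = parse_url( $sitemap_url );

    if ( ! is_array( $property ) || ! is_array( $sitemap ) ) {
        return false;
    }
    if ( empty( $property['scheme'] ) || empty( $property['host'] )
        || empty( $sitemap['scheme'] ) || empty( $sitemap['host'] ) ) {
        return false;
    }

    $same_origin = strtolower( $property['scheme'] ) === strtolower( $sitemap['scheme'] )
        && strtolower( $property['host'] ) === strtolower( $sitemap['host'] );

    // Normalize the property path so that it always ends with a slash.
    $prefix = rtrim( $property['path'] ?? '/', '/' ) . '/';

    return $same_origin && 0 === strpos( $sitemap['path'] ?? '/', $prefix );
}
```

With the example above, http://www.blog.com/ and http://www.blog.com/sitemap.xml match, while the https or non-www variants of the property do not.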

Google Webmaster Tools reports “Missing XML tag”

Your sitemap is not being read by Google because there is no content. Here are some steps you can take to fix the issue:

  • First, check the sitemap to see if there are any URLs inside. If not, then there is no content for Google to index.
  • If the problematic sitemap is sitemap-externals.xml, check to see if you have added any external pages. If so, make sure the URL for each of them is correct, and there are no empty lines in the “Additional pages” section of the plugin settings page. You can also try saving all settings again using the “Update Options” button at the end of the page.
  • If the problematic sitemap is sitemap-archives.xml and you don’t have any posts, only pages, you can resolve the issue by disabling the “Include archives” setting under “Sitemap content.” This will prevent the sitemap from including archives and only include the pages you want to be indexed.
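
The first step, checking whether a sitemap contains any URLs, can also be scripted. The sketch below counts url elements in sitemap XML that has already been downloaded; the function name is hypothetical and it assumes PHP’s DOM extension is available.

```php
<?php
/**
 * Count the <url> entries in a sitemap document.
 * Returns null when the input is empty or not well-formed XML.
 */
function sketch_count_sitemap_urls( string $xml ): ?int {
    if ( '' === trim( $xml ) ) {
        return null;
    }

    // Collect parser errors internally instead of emitting warnings.
    $previous = libxml_use_internal_errors( true );
    $document = new DOMDocument();
    $loaded   = $document->loadXML( $xml );
    libxml_clear_errors();
    libxml_use_internal_errors( $previous );

    if ( ! $loaded ) {
        return null;
    }

    return $document->getElementsByTagName( 'url' )->length;
}
```

A result of 0 corresponds to the empty sitemap case described above, and null points to a document that is not valid XML at all.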

Google Webmaster Tools reports “Invalid XML” or my Browser says, “error on line XX at column 6: XML declaration allowed only at the start of the document”

If your sitemap or RSS feeds fail with this error, open the sitemap in your browser, choose “View Source,” and check whether a blank line or whitespace precedes the XML declaration. If it does, the stray output usually comes from another plugin or from the functions.php file of your theme. Ensure that there is no blank line or whitespace before the opening <?php tag, and that the functions.php file either ends with ?> with nothing after it or omits the closing tag altogether. If that does not work, try disabling other plugins one by one to find the problematic one. As a temporary fix, you can also try using this whitespace fix.
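
The “View Source” check can be automated with a few lines of PHP. The sketch below reports how many bytes precede the XML declaration in a downloaded sitemap body; the function name is hypothetical, and fetching the sitemap is left to the caller.

```php
<?php
/**
 * Return the number of bytes that precede the XML declaration.
 * 0 means the document starts cleanly; null means no declaration was found.
 */
function sketch_bytes_before_xml_declaration( string $body ): ?int {
    $position = strpos( $body, '<?xml' );

    return false === $position ? null : $position;
}

// A blank line plus a space in front of the declaration counts as two bytes.
var_dump( sketch_bytes_before_xml_declaration( "\n <?xml version=\"1.0\"?><urlset/>" ) );
// prints int(2)
```

Any result greater than 0 means something printed output before the sitemap, including an invisible UTF-8 byte order mark, which counts as three bytes.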

Google Webmaster Tools reports, “Sitemap is in HTML format”

Google Webmaster Tools occasionally displays this error message. The problem typically arises from a glitch in the tool itself, and the best course of action is to resubmit your sitemap and wait. The issue usually resolves on its own after some time.

Google Webmaster Tools reports “404 Not found” for the sitemap

  • Open the permalink settings in WordPress and click the Save button there. For Apache and Apache-compatible web servers, at least the following rewrite rule is required in your document root:
    RewriteRule ^index\.php$ - [L]
  • If you are using nginx as a web server, add the Rewrite Rules manually. The rules should be displayed on the plugin’s settings page.