XML sitemap: all about its function and how to create it

1 Answer

XML sitemap: all about its function and how to create it

Anyone who cares about how their website appears in search results knows that many factors decide the fight for the top spots. Google's ranking is commonly said to depend on more than 200 criteria, some of which have been confirmed by Matt Cutts, the former head of Google's webspam team, while many others remain hypotheses. It is no coincidence that search engine optimization is a challenge for every webmaster who wants their website to stay visible and accessible in the long term. While factors such as relevant keywords, quality content or responsive design are on everyone's lips, the value of a good XML sitemap is often underestimated.

Index
  1. What is an XML sitemap?
  2. The advantages of an XML sitemap
  3. Structure of an XML sitemap: most important components
  4. XML sitemap example
  5. Creating and submitting an XML sitemap: how it works
    1. Generate sitemaps using the XML-Sitemaps.com online generator
    2. Google XML Sitemaps: create sitemaps with the WordPress plugin

What is an XML sitemap?

An XML sitemap (sitemap.xml) is a text file in XML (Extensible Markup Language) format that lists the URLs of a website's subpages. It can be submitted to Google Search Console or Bing Webmaster Tools to inform search engine crawlers of all available and relevant pages and thereby speed up and streamline indexing. XML sitemaps must meet the requirements of the Sitemaps protocol, which Google, Yahoo, and Microsoft jointly adopted as a standard in 2006 with the aim of improving the quality of search results in the long term. The standard requires UTF-8 encoding, the XML markup described here, and entity escape codes for certain characters (for example, "&gt;" instead of ">").
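The entity escaping mentioned above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library; the function name escape_sitemap_url and the sample URL with query parameters are assumptions made for the example.

```python
from xml.sax.saxutils import escape

# escape() handles &, < and > by default; the Sitemaps protocol also asks
# for single and double quotes to be escaped, so they are added here.
EXTRA_ENTITIES = {"'": "&apos;", '"': "&quot;"}


def escape_sitemap_url(url: str) -> str:
    """Return the URL with the five special XML characters entity-escaped."""
    return escape(url, EXTRA_ENTITIES)


# Hypothetical URL with a query string, used only for illustration.
escaped = escape_sitemap_url("http://one-test.website/search?q=a&lang=en")
print(escaped)
```

The escaped string is what belongs between the <loc> tags; the unescaped ampersand would make the file invalid XML.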

Note

XML sitemaps are different from the sitemaps that many CMSs automatically display in the interface. Those form a site index intended to help visitors navigate. XML sitemaps, by contrast, are not shown to users by default, although they can generally be reached via their URL.

The advantages of an XML sitemap

Although uploading an XML sitemap does not guarantee better indexing in Google and other search engines, a structured link directory does improve the chances. An index that makes all content easily accessible to spiders is especially useful for pages with dynamic content that changes constantly. The same applies to larger web projects that have many subpages but no extensive backlink structure (yet). Such pages tend to get fewer visits from search engine spiders, so a sitemap.xml file can help bots crawl them more efficiently.

Another advantage: XML sitemaps can list not only the URLs of subpages but also multimedia files such as videos or images. Extension tags (for example, <image:image> and <video:video>) tell robots what kind of content is involved. Further child tags describe the content in more detail, for example a video's duration, so that search engines can record it better. There is also a special XML sitemap variant for news portals that promises optimized indexing of articles thanks to specific tags such as genre, publication date or title.
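As a rough sketch of how such a media entry could be generated, the following Python snippet builds a single URL entry with two image references using Google's image sitemap namespace. The function name build_image_sitemap and the image paths are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"

# Register prefixes so the output uses a default namespace and "image:".
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("image", IMAGE_NS)


def build_image_sitemap(page_url, image_urls):
    """Build a sitemap with one page entry and its images (illustrative helper)."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
    ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
    for image_url in image_urls:
        image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
        ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical image paths on the sample domain.
image_sitemap_xml = build_image_sitemap(
    "http://one-test.website/page1/",
    ["http://one-test.website/img/a.jpg", "http://one-test.website/img/b.jpg"],
)
print(image_sitemap_xml)
```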

Tip

Although an XML sitemap can be written by hand, there are generators that create it automatically, such as the online generator XML-Sitemaps.com. Furthermore, most content management systems have plugins that automate the creation of XML sitemaps.

Structure of an XML sitemap: most important components

As in any XML document, the format of a sitemap is built from XML tags. According to the current standard, "Sitemaps 0.9", an XML sitemap requires three mandatory tags:

<urlset>, </urlset>

Every XML sitemap file must begin with an opening <urlset> tag and end with a closing </urlset> tag. This tag encloses the entire file and references the current protocol standard.

<url>, </url>

The opening and closing <url> tags are the parent element of each individual URL entry and therefore mark the beginning and end of one subpage in the list.

<loc>, </loc>

The <loc> tag contains the URL of a page in the web project. The URL must always start with the protocol (for example, "http") and end with a trailing slash if the web server requires one. It must also be shorter than 2,048 characters.
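The two <loc> rules can be checked before a URL is written to the file. The following is a minimal sketch; the function name is_valid_loc is hypothetical, and restricting the protocol to http and https is an assumption made for this example.

```python
from urllib.parse import urlsplit

MAX_LOC_LENGTH = 2048  # the URL must be shorter than this many characters


def is_valid_loc(url: str) -> bool:
    """Check that a URL starts with a protocol and respects the length limit."""
    parts = urlsplit(url)
    # Assumption for this sketch: only http and https count as valid protocols.
    has_protocol = parts.scheme in ("http", "https") and bool(parts.netloc)
    return has_protocol and len(url) < MAX_LOC_LENGTH


print(is_valid_loc("http://one-test.website/page1/"))
print(is_valid_loc("one-test.website/page1/"))
```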

Apart from these required tags, there are optional tags such as <priority>, <lastmod>, and <changefreq> that describe individual URL entries in more detail. How far these optional tags are supported depends on the search engine. Google's crawler, for example, mainly uses <lastmod> and largely ignores the other two tags or lets them influence crawling only minimally.

sitemap.xml - optional tags

<lastmod>, </lastmod>

The <lastmod> tag specifies the date of the last change to the page in W3C Datetime format (for example, YYYY-MM-DD). The tag is independent of the If-Modified-Since mechanism, in which the web server answers a conditional request with an HTTP 304 (Not Modified) response.
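Python's ISO 8601 output already matches the W3C Datetime format, so a <lastmod> value can be produced roughly as follows. The helper names lastmod_date and lastmod_datetime are made up for this sketch, and the second helper assumes a timezone-aware datetime.

```python
from datetime import date, datetime, timezone


def lastmod_date(d: date) -> str:
    """Date-only form, YYYY-MM-DD."""
    return d.isoformat()


def lastmod_datetime(dt: datetime) -> str:
    """Full form with time and UTC offset; expects a timezone-aware datetime."""
    return dt.astimezone(timezone.utc).isoformat(timespec="seconds")


print(lastmod_date(date(2018, 3, 5)))
print(lastmod_datetime(datetime(2018, 3, 5, 14, 30, tzinfo=timezone.utc)))
```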

<changefreq>, </changefreq>

The <changefreq> tag tells search engine robots how often a page is expected to change. The permitted values are "always", "hourly", "daily", "weekly", "monthly", "yearly" and "never". Documents that change with each access are marked "always", archived URLs "never".

<priority>, </priority>

This tag expresses the priority of a URL relative to the other URLs of the same website on a scale from 0.0 to 1.0 (default priority: 0.5). It tells crawlers which pages the webmaster considers particularly important to index.

Since an XML sitemap file can contain a maximum of 50,000 URLs and must not exceed 50 MB, the URLs of larger websites can be spread across multiple files. In that case, each sitemap file must be listed in an additional index file whose structure is essentially the same as that of the sitemap files, except that the <sitemapindex> and <sitemap> tags must be used instead of <urlset> and <url>.
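Splitting a long URL list and building the matching index file could look roughly like this. It is a sketch under assumptions: the helper names chunk_urls and build_sitemap_index, the generated page URLs and the file names sitemap-1.xml, sitemap-2.xml and so on are all invented for the example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000


def chunk_urls(urls, size=MAX_URLS_PER_FILE):
    """Split the URL list into slices that each fit into one sitemap file."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Build the index file that lists every individual sitemap file."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = sitemap_url
    return ET.tostring(root, encoding="unicode")


# Hypothetical site with 120,000 pages, which needs three sitemap files.
urls = [f"http://one-test.website/page{i}/" for i in range(120_000)]
chunks = chunk_urls(urls)
index_xml = build_sitemap_index(
    [f"http://one-test.website/sitemap-{n}.xml" for n in range(1, len(chunks) + 1)]
)
print(len(chunks))
```

Each slice would then be written to its own sitemap file, and only the index file needs to be submitted to the search engine.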

Note

Sitemap files can be compressed (for example, with gzip), but this only serves to reduce bandwidth requirements. It does not raise the maximum size of a sitemap, as the limit always applies to the uncompressed version of the file.
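A small sketch of this rule in Python: the size check runs on the uncompressed bytes, and only then is the file compressed. The function name compress_sitemap and the tiny sample document are assumptions for illustration.

```python
import gzip

MAX_UNCOMPRESSED_BYTES = 50 * 1024 * 1024  # 50 MB, measured before compression


def compress_sitemap(xml_text: str) -> bytes:
    """Gzip a sitemap after checking the size of the uncompressed content."""
    raw = xml_text.encode("utf-8")
    if len(raw) > MAX_UNCOMPRESSED_BYTES:
        raise ValueError("sitemap exceeds 50 MB uncompressed; split it first")
    return gzip.compress(raw)


# Deliberately tiny sample content, repeated so that compression pays off.
sample_xml = "<url><loc>http://one-test.website/</loc></url>" * 1000
compressed = compress_sitemap(sample_xml)
print(len(sample_xml.encode("utf-8")), len(compressed))
```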

XML sitemap example

The easiest way to understand the structure of an XML sitemap is to use a concrete example:

  <?xml version="1.0" encoding="UTF-8"?>
  <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <url>
      <loc>http://one-test.website/</loc>
      <lastmod>2018-01-01</lastmod>
      <changefreq>monthly</changefreq>
      <priority>1.0</priority>
    </url>
    <url>
      <loc>http://one-test.website/page1/</loc>
      <lastmod>2018-03-05</lastmod>
      <changefreq>weekly</changefreq>
      <priority>0.5</priority>
    </url>
    <url>
      <loc>http://one-test.website/page2/</loc>
      <lastmod>2018-03-08</lastmod>
      <changefreq>weekly</changefreq>
      <priority>0.3</priority>
    </url>
  </urlset>

Our sample XML sitemap includes the main URL http://one-test.website/ and the URLs of two subpages (page 1 and page 2). Search engine crawlers can see in the document that the webmaster has given the home page the highest priority and that it is expected to change about once a month. Its last adjustment was made on January 1, 2018. Page 1 has the default priority value (0.5) but, unlike the home page, is expected to change weekly (the last modification was made on March 5, 2018). If the robot takes the sitemap's <priority> tag into account, it knows that it should pay less attention to page 2 during indexing (<priority> with value 0.3). This page is also changed weekly (last modified March 8, 2018).
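For readers who prefer to script the file instead of typing it, the sample sitemap described above can be generated with Python's standard library. This is a minimal sketch, not the method of any particular generator; the function name build_sitemap and the PAGES list are assumptions, and ET.indent requires Python 3.9 or later.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# (loc, lastmod, changefreq, priority) for the three pages of the sample site.
PAGES = [
    ("http://one-test.website/", "2018-01-01", "monthly", "1.0"),
    ("http://one-test.website/page1/", "2018-03-05", "weekly", "0.5"),
    ("http://one-test.website/page2/", "2018-03-08", "weekly", "0.3"),
]


def build_sitemap(pages):
    """Return a complete sitemap document as a string."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod, changefreq, priority in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
        ET.SubElement(url, "changefreq").text = changefreq
        ET.SubElement(url, "priority").text = priority
    ET.indent(urlset)  # pretty-print; available from Python 3.9
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


sitemap_xml = build_sitemap(PAGES)
print(sitemap_xml)
```

Writing sitemap_xml to a file named sitemap.xml with UTF-8 encoding yields a document with the same entries as the example.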

Creating and submitting an XML sitemap: how it works

Given the enormous amount of work involved in creating a sitemap manually, plugins and online tools are a good choice as long as they are used correctly. XML sitemaps can be generated without any specific settings, but it is better to make individual adjustments so that the document meets your requirements. As examples, we present the XML-Sitemaps.com online generator and the WordPress plugin Google XML Sitemaps and the options they offer for creating and integrating these documents.

Generate sitemaps using the XML-Sitemaps.com online generator

The XML-Sitemaps.com online generator has offered users a good way to create their own XML sitemaps since 2005. The web service is free for websites with up to 500 subpages, while larger sites require a paid subscription plan. The procedure is very simple: after opening the web application, enter the URL of your website in the input field:

image
Make sure to enter the main URL of your website in the input field of the online generator. If you choose a different URL instead, the tool only looks at some of the existing subpages.

Use the "More Options" button to indicate whether the sitemap entries should include the <lastmod>, <changefreq> or <priority> tags. The <lastmod> tag can be activated or deactivated, while the <changefreq> option allows you to set the desired update frequency (hourly, daily, weekly, etc.) if you want to use this tag. Otherwise, just keep the default setting "Don't specify".

image
XML-Sitemaps.com determines the priority level of a subpage based on its distance from the main page, so you have to make more specific adjustments yourself after generating the sitemap.

Click "START" to start the build process. Its duration depends on the size of your website. Once the process is finished, you can view the result under "VIEW SITEMAP DETAILS" > "VIEW FULL XML SITEMAP".

image
The sitemap preview gives you a first idea of the structure of the generated XML sitemap.

Use the "Download" button to download the generated file and upload it to the root directory of your website. To inform the Google crawler about the existence of the file, submit it in Google Search Console (requires a Google account and the website added as a property). You can also state where the sitemap can be found by adding a Sitemap line with its full URL anywhere in the robots.txt file.
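The robots.txt entry takes the form "Sitemap:" followed by the full URL of the sitemap. As a minimal sketch, the following Python helper appends that line to existing robots.txt content if it is not already present. The function name add_sitemap_directive, the sample robots.txt content and the location http://one-test.website/sitemap.xml are assumptions for illustration.

```python
def add_sitemap_directive(robots_txt: str, sitemap_url: str) -> str:
    """Append a Sitemap line to robots.txt content unless it already exists."""
    directive = f"Sitemap: {sitemap_url}"
    lines = robots_txt.splitlines()
    if directive in (line.strip() for line in lines):
        return robots_txt
    lines.append(directive)
    return "\n".join(lines) + "\n"


# Hypothetical existing robots.txt that allows all crawlers everywhere.
existing = "User-agent: *\nDisallow:\n"
updated = add_sitemap_directive(existing, "http://one-test.website/sitemap.xml")
print(updated)
```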

Google XML Sitemaps: create sitemaps with the WordPress plugin

For more than a decade, the WordPress plugin Google XML Sitemaps, developed by Arne Brachhold, has made creating XML sitemaps for a website child's play. To use the popular plugin (which has over 2 million active installations worldwide) on your WordPress website, first install it through the plugin area of the content management system. Select "Plugins" from the menu, then click "Install" and enter "Google XML Sitemaps" in the search field. The extension should appear at the top of the displayed results, and clicking "Install Now" starts the installation:

image
Under "Plugins" > "Install" you will find many extensions for WordPress, among them Google XML Sitemaps.

Google XML Sitemaps can also be downloaded manually and installed in the plugin directory of your WordPress installation. Once you have activated the extension, you can access it through "XML Sitemap" in the "Settings" menu. Compared to XML-Sitemaps.com, there are significantly more configuration options, spread across the following seven areas:

  • General settings: Here you define the basic settings and determine, for example, whether Google and Bing should be automatically informed about changes or whether the sitemap should be compressed automatically.
  • Additional pages: In this section you can add files or URLs that do not belong to the WordPress site but run on the same domain.
  • Post priority: The settings in this menu are of particular interest to blogs and news portals. If you work with the <priority> tag in the sitemap, define here whether the plugin should calculate the priority of a post and how it should do so.
  • Sitemap content: Use this menu to select the categories of pages to include in the XML sitemap (for example, home page, static pages, archive pages, etc.).
  • Excluded items: If you want to exclude categories or individual posts from the sitemap, you can do so here.
  • Change frequencies: Google XML Sitemaps lets you preset the <changefreq> tag. The change frequency can even be set separately for different page types.
  • Priorities: Here you can make the same settings for the <priority> tag.

Once you have configured the sitemap according to your needs, save your changes. If you click the "Your Sitemap" link after saving, you will submit your XML sitemap to the crawlers of the selected search engines.

image
If you have informed the search engines about the updates to your pages through the link, Google XML Sitemaps notifies you whether the operation succeeded or failed.

...