Understand what a sitemap is and how to create one for your site – WAU

A sitemap is a file that explains to search engines, like Google, how a website is structured, which of its main pages should be indexed and when they were last updated. It is usually created and used in XML format.

It was the search engines that imposed order on the immensity of internet addresses, and they are also largely responsible for bringing your business’s audience to your company’s website, store or blog.

Their tireless crawlers scan the internet day and night in search of new content to fill their fiercely contested search rankings. What you may not know is that you can lend these robots a hand simply by making a small file available on your server: the sitemap!

If you are already familiar with Digital Marketing, you understand the importance of having a light, responsive website that is optimized according to good SEO practices. Sitemaps are part of this subject, but their focus is on getting pages indexed by search engines.

Continue reading to understand the importance of these “maps” and learn how to create yours!

What is a sitemap?

The sitemap is a text file or document developed to facilitate the process of indexing pages in search engines, such as Yahoo!, Bing and, of course, Google.

Crawlers, or robots, as they are also called, have been improved over time, but the volume of content on the internet has also increased dramatically, making the work of these robots harder.

In general, search engines discover new pages by following links on pages that have already been indexed. Sitemaps, in turn, complement this scan by listing all the URLs of a website together with metadata that states when each page was created or modified and how often it is updated, as well as its language and its relevance in relation to other pages.

For media files, such as videos, the running time, the category and even the age rating of the content are also provided. For images, information such as file type, resolution, subject and usage licenses is attached.

Generally, sitemaps can be accessed by including “/sitemap.xml” right after the website address – the Google sitemap, for example, is available at https://www.google.com/sitemap.xml.

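Below you can see an example of a basic sitemap that lists a single address.

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <url>
      <loc>SITE-ADDRESS</loc>
      <lastmod>2005-01-01</lastmod>
      <changefreq>monthly</changefreq>
      <priority>0.8</priority>
   </url>
</urlset>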

If you’ve never come across a file like this before, you probably think that this sitemap is just a bunch of meaningless symbols and URLs. However, if you pay attention to some excerpts, you will notice that the document contains several important pieces of information, such as:

  • “<loc>”: URL location;
  • “<lastmod>”: date of last modification;
  • “<changefreq>”: frequency of updates;
  • “<priority>”: relevance level.

In short, the sitemap sorts and “translates” the data on your site into the language of the search engines, and brings it all together in a single readable document.
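To make the four fields more concrete, here is a minimal Python sketch that reads them back out of a sitemap using only the standard library. The sample document and the example.com address are placeholders chosen for illustration, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Sample document; the address is a placeholder, not a real site.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <url>
      <loc>https://www.example.com/</loc>
      <lastmod>2005-01-01</lastmod>
      <changefreq>monthly</changefreq>
      <priority>0.8</priority>
   </url>
</urlset>"""


def read_sitemap(xml_text):
    """Return one dict per <url> entry with the four basic fields."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    entries = []
    for url in root.findall("sm:url", NS):
        entries.append({
            "loc": url.findtext("sm:loc", namespaces=NS),
            "lastmod": url.findtext("sm:lastmod", namespaces=NS),
            "changefreq": url.findtext("sm:changefreq", namespaces=NS),
            "priority": url.findtext("sm:priority", namespaces=NS),
        })
    return entries


for entry in read_sitemap(SAMPLE):
    print(entry)
```

Fields that are missing from an entry come back as None, since only the location is mandatory in the protocol.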

Sitemap or site map: what are we talking about, anyway?

When searching for sitemap or site map, some seemingly disconnected content can end up confusing you. What happens is that the term has two different meanings within the same context – in this case, websites.

Many pages that appear in the results treat the concept as the hierarchical representation of the structure of a website. This type of site map is a document generally used by developers and web designers to define the order and navigation of pages in a project.

Beyond website creation, this kind of information architecture diagram is also very useful in the development of software, applications and SaaS platforms, as well as in planning UX (User Experience) strategies and improvements.

As you can see, this is not the subject of our article. Although it also contains information about the structure and relevance of the pages, the “site map” we are dealing with here is the digital document whose main function is to clarify the content of the sites for the search tools.

Essentially, there is no difference between the two forms, sitemap or site map, so the meaning of the word can only be determined from its context. However, when referring to files with metadata created to help search engines index a site, most of the available content favors the term sitemap. Don’t get confused!

When and why to create a sitemap?

When the pages of a website are properly linked, web crawlers, such as Googlebot, are able to quickly detect most of the published content. Sitemaps, therefore, play a more critical role in sites that work with less common file types or make large amounts of data available.

Deeply nested pages, which can only be reached by following a long chain of links, can take a long time to be identified. Likewise, content built with less conventional techniques, such as AJAX, can be misunderstood by search engines.

Another important point concerns updates. Some very active news portals and blogs generate so much daily information that their publications are unlikely to be instantly indexed with just the effort of crawlers.

Therefore, we can list some criteria that indicate a greater need for sitemaps. They are:

  • very large site: when there are too many pages to be indexed, it is possible for robots to ignore some addresses and updates;
  • recent site: when the site is new and has few external links, its discovery by the crawlers can take more time;
  • isolated pages: when the site has poor internal linking, some categories and pages may be lost in indexing;
  • site with rich media content: sites that want to appear in specific search categories, such as Google News or Google Shopping.

Even if your site or blog does not fit exactly into the criteria mentioned, a sitemap can help it get crawled by showing search engines the pages you consider most relevant, in addition to preventing important parts of the content from being lost in the scans.

However, make no mistake. Sitemaps do not guarantee that all items on a site will be crawled and indexed, nor do they contain commands capable of forcing specific pages to be indexed immediately. They are just facilitators.

The quality of the crawl also depends on parameters defined by the algorithms of each search platform. Sitemaps, therefore, simply facilitate the interpretation of a website’s content and updates, making the indexing process more efficient.

What are the sitemap formats, types and standards?

In 2006, Google, Yahoo! and Microsoft signed an agreement setting a standard for the creation of sitemaps. The purpose was to facilitate the indexing of the sites, regardless of the search platform used by the user.

The result was a genuine win-win. In addition to helping webmasters get their website pages into search engines, the initiative has also made crawling websites more efficient, enriching search results for all tools.

The guidelines of the defined standard are available on the portal sitemaps.org and can be applied in different file formats. There are also extensions that allow the indexing of pages in specific categories of search engines such as the videos, images or news tabs. Let’s discuss a little more about all of this now.

The main sitemap formats

The most famous search engine today, Google, accepts sitemaps in several formats, such as RSS, mRSS, Atom 1.0 and TXT. However, the most efficient and widespread standard is XML.

The simplest of all, without a doubt, is TXT, which is nothing more than a text file listing the URLs of a website, one per line. The RSS, mRSS and Atom formats are well known from classic news feeds, but they are limited because they usually report only the most recent posts from a site.
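As an illustration of how simple the TXT format is, the following Python sketch builds the content of such a file from a list of addresses. The function name and the example.com URLs are placeholders, not part of any standard tool.

```python
def build_txt_sitemap(urls):
    """Return the content of a TXT sitemap: one absolute URL per line."""
    cleaned = [u.strip() for u in urls if u.strip()]
    return "\n".join(cleaned) + "\n"


content = build_txt_sitemap([
    "https://www.example.com/",
    "https://www.example.com/blog",
])
print(content, end="")
```

When saving the result to disk, the file should be written with UTF-8 encoding, for example with open("sitemap.txt", "w", encoding="utf-8").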

XML sitemaps, in turn, are more efficient in every way. In addition to listing and classifying URLs based on several different criteria, they also communicate additional media information to search engines that can be very important, especially if the focus of your publications is on audiovisual products or content.

The main extensions or types of sitemaps

It is possible to create sitemaps specifically for your video, image, news or e-commerce publications. Search engines have dedicated sections for this kind of content, and you can tell them that you want your content to rank in some of those sections.

In this case, you will need to create an additional sitemap just for the category in question. The only difference in this new file is the extensions, which allow you to include complementary data, such as the length of a video, the authorship of an image or the price of a product.
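A rough Python sketch of what such an extension looks like in practice is shown below for images. It assumes the namespace of Google's image sitemap extension and uses only the image location tag; the function name and addresses are hypothetical, and the official documentation of each search engine should be checked for the tags currently supported.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Assumption: namespace of Google's image sitemap extension.
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"


def build_image_sitemap(pages):
    """Build an image sitemap from a {page_url: [image_url, ...]} mapping."""
    root = ET.Element("urlset", {"xmlns": SITEMAP_NS, "xmlns:image": IMAGE_NS})
    for page_url, image_urls in pages.items():
        url = ET.SubElement(root, "url")
        ET.SubElement(url, "loc").text = page_url
        for image_url in image_urls:
            # Each image gets its own <image:image> block inside the page entry.
            image = ET.SubElement(url, "image:image")
            ET.SubElement(image, "image:loc").text = image_url
    return ET.tostring(root, encoding="unicode")


print(build_image_sitemap({
    "https://www.example.com/post": ["https://www.example.com/photo.jpg"],
}))
```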

As for Google News, it is worth mentioning that, before submitting a sitemap with the recommended extensions, you will need to register with the Google News Publisher Center.

The main standards defined by the search engines

There are some limits set for sitemaps. Files that exceed 50 MB (uncompressed) or have more than 50,000 URLs listed are not accepted by Google. Of course, except for large portals and online stores, these numbers are much higher than what a regular website is capable of generating.

However, if this is the case, it is possible to “split” your site map into several sitemaps. You can, for example, create maps by categories or themes, enriching indexing and facilitating the identification of problems in the process.

Another important point is that, when working with several different sitemaps, it is recommended to create an index file indicating the path to the other maps, that is, a sitemap for your sitemaps. Google’s Search Console help provides a specific topic explaining how to split large sitemaps.
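The splitting and indexing described above can be sketched in Python as follows. This is only an illustration under the limits mentioned in this section: the function names and the example.com addresses are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # per-file URL limit mentioned above


def chunk_urls(urls, size=MAX_URLS):
    """Split a list of URLs into lists of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_index(sitemap_urls):
    """Return a sitemap index document listing each child sitemap."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for address in sitemap_urls:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = address
    return ET.tostring(root, encoding="unicode")


# Hypothetical site with 120,001 pages, which needs three sitemap files.
all_urls = [f"https://www.example.com/page-{i}" for i in range(120_001)]
chunks = chunk_urls(all_urls)
child_names = [
    f"https://www.example.com/sitemap-{n}.xml" for n in range(1, len(chunks) + 1)
]
print(len(chunks))
print(build_index(child_names))
```

Each chunk would then be written out as its own sitemap file, and only the index file needs to be submitted to the search engine.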

How to create a sitemap?

Okay, it’s time to get your hands dirty and create a sitemap from scratch! There are several ways to do this, either manually or with the help of tools, and in general the process is relatively simple. Check it out!

1. Define the URLs that will be listed

The first thing to do before starting the creation process is to define which pages Google should crawl. At first you might think that the ideal is to include every address on your site, but pages that contain terms of use or user manuals, for example, may not be very interesting in search results.
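One simple way to apply this selection is to filter the full URL list by path before building the sitemap. The Python sketch below assumes a hypothetical list of excluded path prefixes; the real list depends entirely on how your site is organized.

```python
from urllib.parse import urlparse

# Hypothetical path prefixes for pages that should stay out of the sitemap.
EXCLUDED_PREFIXES = ("/terms", "/manual")


def select_urls(urls, excluded=EXCLUDED_PREFIXES):
    """Keep only the URLs whose path does not start with an excluded prefix."""
    return [u for u in urls if not urlparse(u).path.startswith(excluded)]


print(select_urls([
    "https://www.example.com/",
    "https://www.example.com/terms-of-use",
    "https://www.example.com/blog/what-is-a-sitemap",
]))
```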

2. Define the formats and extensions that will be used

As described, you can create a sitemap in different formats, with XML being the most recommended. When creating it manually, just use a simple text editor like Windows Notepad and save the file with the .xml extension and UTF-8 encoding.

3. Create your sitemap

Option 1: manually

The protocol and all tag definitions are available on the sitemaps.org portal. The main requirements described there are:

  • start the document with the opening tag “<urlset>” and end it with the closing tag “</urlset>”, without the quotation marks;
  • specify the namespace (the protocol standard) inside the “<urlset>” tag;
  • for each URL added, include a “<url>” entry as the parent tag and a “<loc>” child entry inside it.
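If you prefer to generate the file with a short script rather than typing it by hand, the requirements above (a urlset root with the sitemaps.org namespace, one url entry per page and a loc child in each) can be sketched in Python like this. The function name and the sample page are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
OPTIONAL_TAGS = ("lastmod", "changefreq", "priority")


def build_sitemap(pages):
    """Build an XML sitemap from a list of dicts; only "loc" is required."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(root, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        for tag in OPTIONAL_TAGS:
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


xml_text = build_sitemap([
    {"loc": "https://www.example.com/", "lastmod": "2005-01-01",
     "changefreq": "monthly", "priority": 0.8},
])

# Save with UTF-8 encoding, as recommended in step 2.
with open("sitemap.xml", "w", encoding="utf-8") as handle:
    handle.write(xml_text)
```

A side benefit of building the document with an XML library is that special characters in URLs, such as “&”, are escaped automatically.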

If you happen to be using extensions to improve the indexing of videos, images or news, you will need to follow the guidelines that each search tool provides. Google’s help pages cover each of these content types.

If you don’t know how to create this type of file, don’t worry. There are many tools that can help you create a sitemap in a much easier way.

Option 2: with tools

With some practical tools, we can create a sitemap with a few clicks. Check out some options below.

Standalone tools – Google provided its own sitemap generator, but unfortunately the project was discontinued.

However, there are still some interesting alternatives like GSiteCrawler, which simulates search engine robots and automatically creates a sitemap, and XML-Sitemaps, which creates free maps for sites with up to 500 pages.

WordPress Tools – if your site runs on the most beloved CMS on the planet, WordPress, you will have no difficulty creating your sitemaps.

Basic plugins like Better WordPress Google XML Sitemaps or complete SEO tools like Yoast SEO solve everything for you in a practical and intuitive way. Our blog’s sitemaps are actually generated by Yoast SEO.

What do I do with my sitemap?

In principle, the recommendation is that your sitemaps be uploaded to your server, and you can do this manually through your hosting platform’s cPanel or using an FTP client like FileZilla.

The process couldn’t be simpler: just upload the file to the same folder as the URLs it lists (usually the root of the site). But that alone is not enough.

Since Google dominates the search engine market, it is worth making a few more adjustments aimed specifically at that platform. For that, we have Search Console, Google’s own tool that helps site owners improve their SEO results.

Integrate Yoast SEO with Search Console

If you have a WordPress site with the Yoast SEO plugin installed, again, everything will be solved in very few steps. The good news is that Yoast can be easily integrated with Google Search Console.

Right after installing the plugin, you will receive a notification to perform the initial SEO settings for your site, including creating a sitemap and integrating it with Search Console. You can also access the Yoast menu in the sidebar and click on the link “Configuration wizard” inside the “First-time SEO configuration” bar.

You will be taken through a step-by-step wizard in which you must indicate:

  • section 1: whether you want to allow indexing of your website (whether it is live or under construction);
  • section 2: what type of website it is (blog, portfolio, news channel etc.);
  • section 3: whether the website is personal or corporate;
  • section 4: whether you want posts, pages or both to be displayed in search results;
  • section 5: whether the website content is created by a single person or by multiple authors;
  • section 6: integration with Search Console (we will see below);
  • section 7: appearance of the site title.

Sections 8 and 9 are just invitations to subscribe to the Yoast SEO newsletter and upgrade to its Premium services.

When integrating with Search Console in section 6, you will need to be registered on the platform. If you don’t have an account yet, you can create one quickly on the Search Console website using a Google account.

That done, just generate an authentication code by clicking on the link “Get Google Authorization Code”.

Once generated, just copy the code, paste it into the bar indicated and click “Authenticate”.

Submit a sitemap to Google Search Console

If you use other platforms to manage your website, such as Wix, Joomla or Drupal, the process remains simple. Just follow the guidelines below.

  1. First, log in to Search Console with your Google account (if this is your first access, you will need to follow a few steps to get to the panel);
  2. Select your site on the panel;
  3. Click “Sitemaps” in the sidebar;
  4. Add the sitemap URL (this is usually the domain followed by “/sitemap.xml” or “/sitemap_index.xml”);
  5. Click to send.
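If you are not sure which address to enter in step 4, the following Python sketch builds the two usual candidates from any page of the site. The paths are the common defaults mentioned above, not a guarantee of where a given site actually keeps its sitemap, and the function name is hypothetical.

```python
from urllib.parse import urljoin

# Common default locations; a given site may use a different path.
COMMON_PATHS = ("/sitemap.xml", "/sitemap_index.xml")


def candidate_sitemap_urls(site_url):
    """Return the usual sitemap addresses for the domain of `site_url`."""
    return [urljoin(site_url, path) for path in COMMON_PATHS]


print(candidate_sitemap_urls("https://www.example.com/blog/"))
```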

What are the benefits of sitemaps in terms of SEO?

The use of sitemaps, by itself, already makes it clear to the search engines that the webmaster is willing to collaborate with the scanning of their crawlers, which gives the site a more professional tone.

In terms of SEO, in addition to facilitating the indexing of pages, sitemaps report updates to published content, allowing it to be re-crawled more efficiently. If you regularly update your blog posts or recycle well-ranked URLs, the benefits are even greater.

It is worth mentioning that sitemaps are linked to Google Search Console which, among many other functions, helps the user to optimize their SEO actions.

How can I improve my results even more?

Here are some final recommendations so that your sitemaps continue helping your website, blog or online store to rank well in searches:

  • when creating your files, give preference to canonical URLs (the best address available for the page or the most used);
  • even though the URL limit on a sitemap is 50,000, it is recommended not to exceed 10,000 to keep the indexing process efficient;
  • avoid frequent changes of transfer protocol (switching between http and https, for example); set up HTTPS in advance and stick with it.

FAQ (frequently asked questions)

How do I delete a sitemap from my site?

If for any reason – error, update or project closure – you want to delete a sitemap from your site, you can do so as follows:

  1. access your hosting files through cPanel or through an FTP client;
  2. locate the sitemap file in the folder where it was uploaded;
  3. delete it.

How do I delete a Search Console sitemap?

You can delete a sitemap from Search Console, but that will not prevent Google from recognizing your site’s sitemap and its URLs. To proceed, do the following:

  1. access your user panel in Search Console;
  2. select the “Sitemaps” menu;
  3. locate the sitemap you want to delete and click on “more options” (symbol with three dots);
  4. select the “Remove sitemap” option.

My website has millions of URLs, which ones should I enter in my sitemaps?

Group the URLs that change most frequently into a small number of sitemaps and flag those sitemaps in the index file using the “lastmod” tag. In this way, search engines can incrementally re-crawl only the sitemaps that have been updated.
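A minimal Python sketch of this idea follows: the index records a last-modified date for each child sitemap, and a reader of the index can then pick out only the files that changed since its previous visit. The function names, dates and addresses are hypothetical, and how a real search engine uses the dates is up to that engine.

```python
from datetime import date
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_index_with_lastmod(sitemaps):
    """Build a sitemap index from (sitemap_url, last_modified_date) pairs."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for address, modified in sitemaps:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = address
        ET.SubElement(entry, "lastmod").text = modified.isoformat()
    return ET.tostring(root, encoding="unicode")


def changed_since(sitemaps, last_visit):
    """Return the sitemap URLs modified after `last_visit`."""
    return [address for address, modified in sitemaps if modified > last_visit]


sitemaps = [
    ("https://www.example.com/sitemap-news.xml", date(2024, 5, 2)),
    ("https://www.example.com/sitemap-archive.xml", date(2020, 1, 1)),
]
print(build_index_with_lastmod(sitemaps))
print(changed_since(sitemaps, date(2024, 1, 1)))
```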

What to do with dynamic pages?

XML sitemaps help sites with a more complex structure, usually e-commerce stores, to get their dynamic pages discovered and indexed more quickly.

Can I compress my sitemap?

Yes. To compress it, use gzip. Keep in mind that the 50 MB limit applies to the uncompressed file, so a sitemap must be no larger than 50 MB once it is decompressed.
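As a small illustration, this Python sketch compresses a sitemap with the standard gzip module after checking its uncompressed size. The function name is hypothetical, and the limit is taken as 50 × 1024 × 1024 bytes.

```python
import gzip

MAX_BYTES = 50 * 1024 * 1024  # 50 MB, measured on the uncompressed file


def compress_sitemap(xml_text):
    """Return the gzip-compressed bytes of a sitemap, enforcing the size limit."""
    raw = xml_text.encode("utf-8")
    if len(raw) > MAX_BYTES:
        raise ValueError("sitemap exceeds 50 MB uncompressed; split it first")
    return gzip.compress(raw)


compressed = compress_sitemap("<urlset></urlset>")

# The result is usually saved as sitemap.xml.gz.
with open("sitemap.xml.gz", "wb") as handle:
    handle.write(compressed)
```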

Does the order of the URLs in the listing influence indexing?

No. The order of the listed URLs is not a relevant factor for indexing.

Finally, if you manage your sitemaps manually, remember to update them whenever you post new content to your site. For those who rely on the help of plugins like Yoast SEO, this task is not necessary, but it is essential to visit Search Console frequently to check the performance of your SEO strategy.

Did you like the article and want to know more about it? So stay with us and check out our definitive guide to Google Search Console now!