The Ask an SMXpert series continues the question-and-answer (Q&A) segment held during sessions at SMX Advanced 2018 in Seattle.
Today’s Q&A is from the Advanced Technical SEO: Page Speed, Site Migrations and Crawling session with Frederic Dubut from Microsoft/Bing.
Question: If we have a bunch of new content we want to index but are worried we don’t have enough crawl budget, which should we do first: focus on increasing crawl or indexing the new content?
Frederic: If you are currently running into severe crawl issues, you should definitely work on stabilizing your crawl. Keep in mind the most severe crawl issues are generally due to duplicate content. This is something you should do regardless of whether you intend to publish a bunch of new content or not, although this is even more critical in the former case.
If you are not currently running into crawl issues, and the amount of new content is not disproportionate with the amount of existing content, then you can start with indexing the new content and monitor how fast the crawler is picking it up.
Publishing an up-to-date and comprehensive sitemap will help, and so will publishing a Really Simple Syndication (RSS) feed linking to all new content posted in the past three hours. Remember to register both of them in Bing Webmaster Tools (BWT) for the highest impact.
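As a rough illustration of the sitemap half of that advice, here is a minimal Python sketch that builds a sitemap in the standard sitemaps.org XML format. The function name build_sitemap, the example URLs and the dates are assumptions made for illustration; they are not part of any Bing tooling.

```python
from datetime import date
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML for an iterable of (url, last_modified_date) pairs.

    build_sitemap is a hypothetical helper, not a Bing API.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, last_modified in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        # lastmod uses the W3C date format (YYYY-MM-DD).
        ET.SubElement(entry, "lastmod").text = last_modified.isoformat()
    return ET.tostring(urlset, encoding="unicode")


# Example pages are placeholders.
sitemap_xml = build_sitemap([
    ("https://example.com/", date(2018, 6, 12)),
    ("https://example.com/new-article", date(2018, 6, 13)),
])
print(sitemap_xml)
```

Regenerating this file whenever content changes, and registering its URL in BWT, keeps the list of URLs the crawler sees in step with what is actually on the site.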
Question: Google says they tend to consider a 302 as a 301 over time. Is that the same for Bing?
Frederic: In general, we strongly recommend using a hypertext transfer protocol (HTTP) 301 for permanent redirects and HTTP 302 for temporary redirects.
As a rule of thumb, if a uniform resource locator (URL) redirects to the same target for more than one day, it should probably be an HTTP 301 redirect.
If Bing sees your HTTP 302 redirect always points to the same URL, it may eventually consider it as a permanent redirect (i.e., an HTTP 301 redirect). However, this would take an undetermined amount of time, and there is no guarantee that this happens at all.
To keep full control over your indexing and ensure that signals are propagated properly, you should always use HTTP 301 for permanent redirects, especially in cases of large-scale migrations, which are already tricky enough.
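The distinction can be sketched in a few lines of Python. This is an illustration only: redirect_status and redirect_response are hypothetical helpers, and a real site would normally set the status code in its web server or framework configuration rather than build raw responses by hand.

```python
from http import HTTPStatus


def redirect_status(permanent: bool) -> HTTPStatus:
    """Pick 301 for permanent moves and 302 for temporary ones."""
    return HTTPStatus.MOVED_PERMANENTLY if permanent else HTTPStatus.FOUND


def redirect_response(target: str, permanent: bool) -> str:
    """Build the header section of a minimal HTTP/1.1 redirect response."""
    status = redirect_status(permanent)
    return (
        f"HTTP/1.1 {status.value} {status.phrase}\r\n"
        f"Location: {target}\r\n"
        "\r\n"
    )


# A migrated page that will never move back: use a permanent redirect.
print(redirect_response("https://example.com/new-page", permanent=True))

# A short-lived promotion landing page: use a temporary redirect.
print(redirect_response("https://example.com/sale", permanent=False))
```

Making the permanent/temporary choice explicit in code, as the permanent flag does here, is one way to avoid leaving a default 302 in place after a migration.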
Question: Assuming one is not concerned with performance degradation, is crawl budget a concern?
Frederic: If your site performance degrades, the Bing crawler will automatically throttle itself to preserve the state of your site. This would actually result in a reduction in crawl budget.
To the more general question of whether crawl budget should be a concern for you, if your site is relatively small and well optimized for search engines, then it is probably not a concern. The larger your site, the more you need to think about crawl budget and how to meet crawl demand.
Question: What is the optimal number of URLs to have in sitemaps for Bing’s crawler? Is there a preferred sort order for the URLs or should they be randomized?
Frederic: The sort order for the URLs in your sitemap does not matter. After the Bing crawler downloads your sitemap, the URLs are extracted and joined with all the other signals we have already accumulated about them. The crawl queue is then prioritized based on the aggregated signals.
There is no optimal number of URLs either. You should list all the relevant URLs for your site in your sitemap and update it at least once a day; it is as simple as that. Of course, you also need to make sure your sitemap does not contain duplicate content or bad URLs.
Question: If you have an e-commerce website with many URL parameter product filter pages, each with canonicals to the root unfiltered listing page, does that waste crawl budget? If so, what would you do instead?
Frederic: In this specific situation, the most effective way to tell the crawler to focus on the canonicals (without query parameters) is to add these query parameters to the Ignore URL Parameters list in Bing Webmaster Tools. After you do that, the Bing crawler will essentially consider the URLs with and without these query parameters as equivalent, and it will focus its crawl on the ones without. You may still see some limited crawl volume on the URLs with query parameters for validation purposes.
You should also include all the canonicals in your sitemap and make sure that it does not contain any of these duplicate URLs with filter query parameters. By doing all of that, you will greatly mitigate any impact on crawl budget of having these many different URL variations.
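To keep filtered variations out of the sitemap, one possible approach is to strip the filter parameters from each URL and deduplicate the result. The Python sketch below assumes hypothetical filter parameter names (color, size, sort) and hypothetical helper names; it does not replace the Ignore URL Parameters setting in Bing Webmaster Tools, it only helps keep the sitemap clean.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical filter parameters; use the ones your own site generates.
FILTER_PARAMS = {"color", "size", "sort"}


def canonical_url(url: str) -> str:
    """Return the URL with filter query parameters (and any fragment) removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in FILTER_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def sitemap_urls(urls):
    """Map URLs to their canonicals and drop duplicates, keeping first-seen order."""
    return list(dict.fromkeys(canonical_url(url) for url in urls))


crawled = [
    "https://example.com/shoes?color=red",
    "https://example.com/shoes?size=9&sort=price",
    "https://example.com/shoes",
    "https://example.com/hats?color=blue",
]
print(sitemap_urls(crawled))
```

Parameters that are not in the filter list, such as a product identifier, are left untouched, so only the variations that duplicate the unfiltered listing page are collapsed.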
Bing announced Bing AMP viewer and JSON-LD support in our BWT console during SMX Advanced. You can read about it here. (Source: Search Engine Land)