How To Quickly Troubleshoot and Fix “Crawled – Currently Not Indexed” Issues in Google Search Console
Introduction
Seeing the status “Crawled – Currently Not Indexed” for your web pages in Google Search Console can be frustrating. It means that Google has discovered and crawled your pages but has not added them to its search index, so they won't show up in search results, even when they are highly relevant to someone's query.
The good news is that in most cases, this issue can be fixed by making some adjustments to your site. Here are the main reasons pages get crawled but not indexed, and what you can do about it:
1. New or Recently Changed Pages
When you first launch a new website, add new pages, or make significant content changes to existing pages, it takes some time for Google to re-crawl and re-index those pages. So seeing “Crawled – Currently Not Indexed” is expected during the first few weeks after making major site changes.
The fix is simple – be patient! It typically takes 1-2 weeks for Google to fully recrawl and update the index status of new or changed pages. Keep creating high-quality content and optimizing your site, and eventually, those pages will get indexed.
To speed up the process a bit, you can:
- Submit new or changed URLs in Google Search Console for crawling.
- Add XML sitemaps to help Google discover new content faster (a minimal sitemap-generation sketch follows this list).
- Use internal linking to pass “link juice” to new pages.
But other than that, give Google time to recrawl and include the new pages in its index.
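To illustrate the sitemap point above: a sitemap is just an XML file listing the URLs you want Google to discover. Below is a minimal sketch in Python that writes one out following the sitemaps.org protocol; the domain and page paths are placeholders for your own site.

```python
# Minimal sketch: write a basic XML sitemap following the sitemaps.org protocol.
# The domain and paths below are placeholders -- substitute your own URLs.
from datetime import date

base_url = "https://www.example.com"
pages = ["/", "/blog/new-post/", "/services/"]  # pages you want indexed

entries = "\n".join(
    f"  <url>\n    <loc>{base_url}{path}</loc>\n"
    f"    <lastmod>{date.today().isoformat()}</lastmod>\n  </url>"
    for path in pages
)

sitemap = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
    f"{entries}\n</urlset>\n"
)

with open("sitemap.xml", "w", encoding="utf-8") as f:
    f.write(sitemap)
```

Once the file is live on your site (or its location is declared in robots.txt), submit it under Sitemaps in Search Console so Google picks it up quickly.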
2. Technical SEO Issues
Sometimes Google wants to index your pages but is prevented from doing so due to technical barriers. Here are some common technical SEO problems that lead to crawling without indexing:
NoIndex Meta Tags or Headers
Using noindex tags or X-Robots-Tag headers tells Google not to index a page. Make sure you remove noindex directives from any pages you want indexed.
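To spot stray noindex directives, you can fetch a URL and look at both the response headers and the meta robots tag. Here's a minimal sketch, assuming the requests library is installed and using a placeholder URL:

```python
# Rough check for noindex directives in both the X-Robots-Tag header and the
# <meta name="robots"> tag. Assumes the requests library is installed; the
# URL below is a placeholder.
import re
import requests

url = "https://www.example.com/some-page/"
resp = requests.get(url, timeout=10)

header = resp.headers.get("X-Robots-Tag", "")
if "noindex" in header.lower():
    print(f"X-Robots-Tag header blocks indexing: {header}")

meta = re.search(
    r'<meta[^>]+name=["\']robots["\'][^>]*content=["\']([^"\']+)["\']',
    resp.text,
    re.IGNORECASE,
)
if meta and "noindex" in meta.group(1).lower():
    print(f"Meta robots tag blocks indexing: {meta.group(1)}")
else:
    print("No noindex directive found in the meta robots tag.")
```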
Blocked by Robots.txt
If your robots.txt file blocks Googlebot from crawling a page, Google cannot see its content and generally will not index it. Check your robots.txt configuration and make sure you are not unintentionally blocking access.
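Python's standard library can tell you whether a given crawler is allowed to fetch a URL under your current robots.txt rules. A quick sketch (the URLs are placeholders):

```python
# Check whether Googlebot is allowed to crawl a URL under your robots.txt.
# Uses only the Python standard library; the URLs below are placeholders.
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.set_url("https://www.example.com/robots.txt")
parser.read()

page = "https://www.example.com/some-page/"
if parser.can_fetch("Googlebot", page):
    print("Googlebot is allowed to crawl this page.")
else:
    print("robots.txt blocks Googlebot from this page.")
```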
Duplicate Content
If Google detects that a page is identical or nearly identical to content found elsewhere on the web, it may choose not to index that page to avoid duplicate content issues. Eliminate duplicate content problems by creating unique page content, or point duplicates at a preferred version with a rel="canonical" tag.
JavaScript Rendering Issues
If Google can't process JavaScript to render page content, that content remains invisible to its algorithms. Test that search engines can access JavaScript-generated content.
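One quick sanity check is to fetch the raw HTML without executing any JavaScript and confirm that your important content is already present in that initial response; if it only appears after client-side rendering, Google may not see it reliably. A rough sketch, with a placeholder URL and phrase:

```python
# Quick sanity check: does key content exist in the raw HTML before any
# JavaScript runs? Assumes the requests library is installed; the URL and
# phrase below are placeholders.
import requests

url = "https://www.example.com/some-page/"
key_phrase = "Our troubleshooting guide"  # text that should be indexable

raw_html = requests.get(url, timeout=10).text
if key_phrase in raw_html:
    print("Key content is present in the server-rendered HTML.")
else:
    print("Key content is missing from the raw HTML; it may depend on "
          "client-side JavaScript rendering.")
```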
Problematic Structured Data
Invalid structured data markup can interfere with Google's ability to parse page content and context. Use Google's Rich Results Test (the successor to the retired Structured Data Testing Tool) to identify and fix issues.
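Before running a page through Google's validator, you can at least confirm that any JSON-LD blocks on it parse as valid JSON. A minimal sketch with a placeholder URL:

```python
# Extract JSON-LD structured data blocks from a page and confirm they parse
# as valid JSON. Assumes the requests library is installed; the URL is a
# placeholder.
import json
import re
import requests

url = "https://www.example.com/some-page/"
html = requests.get(url, timeout=10).text

blocks = re.findall(
    r'<script[^>]+type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
    html,
    re.DOTALL | re.IGNORECASE,
)

for i, block in enumerate(blocks, 1):
    try:
        data = json.loads(block)
        kind = data.get("@type", "unknown") if isinstance(data, dict) else "list"
        print(f"Block {i}: valid JSON ({kind})")
    except json.JSONDecodeError as err:
        print(f"Block {i}: invalid JSON-LD ({err})")
```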
4xx or 5xx Status Codes
HTTP response codes in the 4xx range indicate client errors (such as 404 Not Found), and codes in the 5xx range indicate server errors; either prevents Google from accessing the page content. Resolve any configuration issues triggering 4xx or 5xx responses.
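A quick way to catch these is to request each URL and log anything that doesn't come back with a 200. A minimal sketch, assuming the requests library and placeholder URLs:

```python
# Flag URLs that return 4xx (client error) or 5xx (server error) codes.
# Assumes the requests library is installed; the URLs are placeholders.
import requests

urls = [
    "https://www.example.com/",
    "https://www.example.com/old-page/",
    "https://www.example.com/broken-link/",
]

for url in urls:
    try:
        resp = requests.get(url, timeout=10, allow_redirects=True)
        if resp.status_code >= 400:
            print(f"{resp.status_code}  {url}")
    except requests.RequestException as err:
        print(f"ERROR  {url}  ({err})")
```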
Fixing technical SEO barriers like these will remove roadblocks to Google indexing your pages properly.
3. Quality Issues
In some cases, Google crawls a page but chooses not to add it to the index due to concerns over page quality. Here are some quality signals Google uses when deciding whether to index pages:
Thin Content
Pages with very little text, or text padded out with keyword stuffing, tend to get crawled but not indexed. Add unique, substantive content to thin pages.
Low-Value Content
Pages that offer little value to users, such as redundant category pages or contact pages with just an address, often don't get indexed. Either enhance or remove low-value pages.
Spammy Content
Pages with lots of ads, affiliate links, or other “spammy” content are less likely to get indexed. Provide high-quality content written for humans, not search engines.
Poor Expertise/Authority/Trust (EAT)
Google wants to index pages with high EAT signals relating to expertise, author authority, and website trustworthiness. Improve site reputation and build subject-matter expertise.
Negative User Engagement Signals
Very low dwell time, high bounce rates, and few backlinks or social shares can all signal that a page isn't satisfying visitors. Improve page quality to boost engagement metrics.
By identifying and improving any thin, low-value, or spammy content, while building your site's reputation and expertise, you can convince Google your pages deserve to be indexed based on their quality.
4. Site Architecture Issues
Sometimes a site's information architecture makes it difficult for Google to access or parse all of its content. Here are some common site architecture problems that can lead to crawling without indexing:
Problematic Sitemaps
Incorrect sitemap formatting, invalid or broken links, or excessively large sitemap files can cause indexing issues. Validate and optimize your sitemaps.
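As a rough sanity check, you can parse a sitemap, count its entries, and flag anything over the protocol's limits of 50,000 URLs or 50 MB uncompressed per file. A sketch, with a placeholder sitemap URL:

```python
# Basic sitemap sanity check: parse the file, count <url> entries, and warn
# if it exceeds the sitemaps.org limits of 50,000 URLs or 50 MB uncompressed.
# Assumes the requests library; the sitemap URL is a placeholder.
# Note: a sitemap index file uses <sitemap> entries instead of <url>.
import xml.etree.ElementTree as ET
import requests

sitemap_url = "https://www.example.com/sitemap.xml"
resp = requests.get(sitemap_url, timeout=10)

ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
root = ET.fromstring(resp.content)
locs = [el.text for el in root.findall("sm:url/sm:loc", ns)]

print(f"{len(locs)} URLs listed, {len(resp.content) / 1_000_000:.1f} MB fetched")
if len(locs) > 50_000:
    print("Over the 50,000-URL limit; split the sitemap into multiple files.")
if len(resp.content) > 50_000_000:
    print("Over the 50 MB uncompressed size limit; split or compress the file.")
```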
Crawl Budget Limitations
Large or complex sites with millions of pages may hit Google's crawl budget limits, preventing full indexing. Use sitemaps to prioritize important pages for indexing.
Pagination Problems
If Google can't access paginated content beyond page 1, those pages will remain uncrawled and unindexed. Make sure the pagination is crawlable and indexable.
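A rough way to verify this is to walk your paginated URLs and confirm that each one returns a 200 and carries no noindex directive. The ?page= URL pattern below is hypothetical; adjust it to your own pagination scheme.

```python
# Walk a hypothetical ?page=N pagination scheme and confirm each page returns
# 200 and is not marked noindex. Assumes the requests library; the URL pattern
# and page count are placeholders.
import requests

base = "https://www.example.com/blog/?page={}"  # hypothetical pagination pattern

for page_num in range(1, 11):
    url = base.format(page_num)
    resp = requests.get(url, timeout=10)
    noindex = (
        "noindex" in resp.headers.get("X-Robots-Tag", "").lower()
        or 'content="noindex' in resp.text.lower()
    )
    status = "OK" if resp.status_code == 200 and not noindex else "PROBLEM"
    print(f"{status}  {resp.status_code}  noindex={noindex}  {url}")
```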
Internal Linking Problems
Broken site navigation, malformed URLs, heavy reliance on #fragment links, and other misleading internal linking can hide pages from Google. Audit your site architecture and links for crawlability.
By optimizing your XML sitemaps, simplifying site architecture, making pagination crawlable, and cleaning up your internal linking scheme, you can uncover previously hidden content and get more pages indexed.
5. Indexation Time Lag
In rare cases, pages show as “Crawled – Currently Not Indexed” due to a time lag between crawling and indexation. Some reasons for this lag:
Server Issues
Problems on Google's end can delay the addition of already-crawled pages to the index. These usually resolve on their own within a few weeks.
Individual Page Exclusion
Google may intentionally exclude certain low-quality pages from the index for an extended period after crawling them.
Indexation Queue Backlog
A huge influx of new pages on a very large site can back up Google's indexing queue. Such pages get indexed eventually.
Core Algorithm Updates
Around the time of major Google algorithm updates, indexation may be intentionally delayed as changes roll out.
For any of these reasons, recently crawled pages sometimes take longer to get indexed. There's not much you can do but wait it out! The pages should make it into the index within a month or two at the most.
Diagnosing Your “Crawled – Not Indexed” Pages
Now that you know the main reasons pages can get crawled but not indexed, how do you diagnose the specific issues affecting your own pages? Here are some tips:
- In Search Console, filter to pages with the “Crawled – Currently Not Indexed” status and look for patterns. Group pages by type, folder location, host, etc. Do any similarities stand out?
- Compare indexed vs unindexed pages. Look for differences in content length, quality, technical factors, etc. What might make Google see them differently?
- Set up analytics tracking and segment by indexed vs non-indexed pages. Compare engagement metrics. Do the non-indexed pages show unusually high bounce rates?
- Search Google for a non-indexed URL verbatim. Does it appear for that exact query? If so, the page is actually indexed, and the discrepancy is likely a property or reporting issue in Search Console.
- Run the page through the URL Inspection tool in Search Console (the successor to the retired “Fetch as Google” feature) and see whether any crawl or indexing errors surface. A scripted version of this check is sketched after this list.
- Speak to your developer. Have them review technical factors like status codes, metadata, script rendering, etc.
- Request an indexing audit from your SEO consultant to spot architecture and quality issues.
By comparing your indexed to non-indexed pages and digging into metrics, crawl data, and website source code, you can usually determine why Google has not added those pages to its index so you can rectify the issue.
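If you're comfortable scripting, the Search Console URL Inspection API can automate the per-URL check mentioned in the list above and report the coverage state Google assigns to each page. A minimal sketch, assuming the google-api-python-client and google-auth libraries, a service account JSON key (the file path is a placeholder) that has been added as a user on your verified property, and placeholder property and page URLs:

```python
# Sketch: query the Search Console URL Inspection API for a page's coverage
# state. Assumes google-api-python-client and google-auth are installed, and
# that the service account has been added as a user on the verified property.
# The key file path, property URL, and page URL are placeholders.
from google.oauth2 import service_account
from googleapiclient.discovery import build

SCOPES = ["https://www.googleapis.com/auth/webmasters.readonly"]
creds = service_account.Credentials.from_service_account_file(
    "service-account.json", scopes=SCOPES
)
service = build("searchconsole", "v1", credentials=creds)

body = {
    "inspectionUrl": "https://www.example.com/some-page/",
    "siteUrl": "https://www.example.com/",  # must match your verified property
}
result = service.urlInspection().index().inspect(body=body).execute()

index_status = result["inspectionResult"]["indexStatusResult"]
print(index_status.get("coverageState"))  # e.g. "Crawled - currently not indexed"
print(index_status.get("lastCrawlTime"))
```

Looping this over a list of URLs gives you a quick, repeatable snapshot of which pages are stuck and when Google last crawled them.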
Best Practices to Get and Keep Pages Indexed
Here are some overall best practices to help you get more of your site indexed in Google search:
- Create unique, high-quality content that offers value for users. Write for humans, not bots!
- Design and build your site to be easily crawlable by search engines: fast load times, simple architecture, and seamless pagination.
- Leverage XML sitemaps to help Google discover new content quickly and efficiently.
- Implement good technical SEO across the site: clean code, proper use of tags and headers, valid structured data, etc.
- Build high-quality, relevant links from reputable external sites to boost page authority.
- Resolve indexing issues quickly when they appear in the Search Console to maintain maximum visibility.
- Submit important new or changed pages to Google through Search Console for speedier indexing.
- Stay patient right after launching new pages or making significant site changes, and give Google time to recrawl and update the index.
Getting your pages indexed in Google is a crucial SEO goal. By optimizing content and site architecture for both users and search engines, addressing technical barriers, improving page authority and reputation, and keeping on top of Search Console data, you can maximize your site's presence in Google's search results.
Conclusion
Seeing “Crawled – Currently Not Indexed” in the Search Console is very common, but it's not an issue you should ignore. Unindexed pages mean lost opportunities – lost chances for your site to rank for relevant searches, drive traffic, and grow your business.
By understanding the potential reasons pages get crawled but not indexed, and following both general and page-specific best practices, you can successfully get your content included in Google's search results.
Some key takeaways:
- Be patient right after launching new pages or making significant site changes. Give Google sufficient time to recrawl and re-index new and updated content.
- Diagnose and resolve any technical barriers like problematic meta tags, blocked pages, duplicate content issues, etc. that prevent Google from indexing pages.
- Improve thin, low-value, or spammy content to meet Google's quality standards for inclusion in its index.
- Make sure your XML sitemaps are optimized and your site architecture facilitates the discovery of all pages.
- For important new pages, use Search Console to request indexing and submit the URLs directly to Google.
- If a page appears in Google's search results but shows as not indexed in Search Console, the discrepancy is likely a property or reporting issue, not an indexing problem.
- For newly crawled pages stuck in limbo, wait a month or two for periodic Google recrawls and queue processing time.
- Leverage the invaluable data in the Search Console to monitor indexation status and uncover bottlenecks.
With persistence and dedication to optimizing both technical and content factors on your site, you can successfully resolve Google indexing dilemmas. Employing best practices for launching new pages and monitoring Search Console data will help you achieve your search visibility goals in the long run.
The reward is well worth the effort – fuller indexing of your website so more pages can be discovered by Google and surfaced to relevant searchers. Removing barriers to entry in Google's index should be a top priority for website owners seeking to tap into search engine traffic.