If site visitors can’t find your web pages, your content isn’t doing you — or them — any good. After investing in creating web pages, make sure they’re accessible to visitors and search engine crawlers, so they don’t end up as orphan pages.
What Are Orphan Pages?
An orphan page is a web page that doesn’t have any internal links pointing to it. Since web crawlers generally “find” content by crawling internal links, an orphaned page is undiscoverable unless it’s included in a sitemap or submitted directly to a search engine.
Without any links pointing to it, an orphaned page is also effectively invisible to site visitors. You can create an excellent piece of content geared toward conversions, but if nothing on your site links to it, visitors browsing your site won’t find it.
In some cases, orphan pages may link to other orphan pages and even attract some organic traffic. However, since they aren’t reachable from the rest of the site, they probably won’t reach their full ranking potential because they won’t benefit from link equity passed from internal links.
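To make the idea concrete, here is a minimal Python sketch that treats a site as a map of internal links and reports every page that can’t be reached from the home page. The page paths and the find_orphans helper are hypothetical, made up for illustration. Note that /old-promo/thanks has an inbound link, but only from another orphan, so it’s still cut off from the rest of the site.

```python
from collections import deque

def find_orphans(links, all_pages, start):
    """Return pages in all_pages that cannot be reached from start by following links."""
    reachable = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in reachable:
                reachable.add(target)
                queue.append(target)
    return sorted(set(all_pages) - reachable)

# Hypothetical site: /old-promo links out, but nothing on the main site links to it.
links = {
    "/": ["/blog", "/services"],
    "/blog": ["/blog/post-1"],
    "/old-promo": ["/old-promo/thanks"],
}
all_pages = ["/", "/blog", "/services", "/blog/post-1", "/old-promo", "/old-promo/thanks"]

orphans = find_orphans(links, all_pages, "/")
print(orphans)  # ['/old-promo', '/old-promo/thanks']
```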
Why Do Orphan Pages Exist?
Orphan pages happen for a variety of reasons, including:
No linking strategy. Accidentally creating an orphaned page is quite simple. If your site menu doesn’t automatically include new pages, any page has the potential to be an orphan page unless you have an internal linking strategy in place.
Failure to remove or redirect. Some orphaned pages are the result of time-bound campaigns. Once the campaign has ended, links to the campaign landing page may be removed. However, if the landing page itself is not redirected or deleted, then it becomes an orphan page.
Site migration error. Some orphan pages may be “leftovers” from a previous site iteration, an oversight during a site migration.
By implementing internal linking and QA processes, you can reduce the chances of orphan pages and ensure that all of your pages can be crawled.
How Do Orphan Pages Impact SEO?
Because orphan pages are harder for Googlebot to find and crawl, they may not rank as well as they could.
Search engine crawlers rely on a site’s internal link structure or its XML sitemap to find all of its pages. If there’s no link to a particular page and it isn’t listed in the sitemap, they won’t be able to find it (unless it’s been directly submitted to Google). If the crawlers aren’t able to access the page, they can’t index it and serve it in search results.
Even if an orphan page appears in your XML sitemap, it can still cause problems. The sitemap may help the crawler find it, but without any internal links, the page won’t receive any link equity. Adding internal links from authoritative pages on the site can help share some of that link equity with the orphan page, which will often improve the page’s performance in search rankings.
Ultimately, if your site has many orphan pages, you should find and fix them to make the most of the time and effort you’ve invested in creating your web pages.
How To Find Orphan Pages With Screaming Frog
One way to identify orphan pages is with a crawler tool like Screaming Frog.
Screaming Frog can help you scan all of the crawlable pages on your website and identify orphan pages from three different sources: XML sitemaps, Google Analytics, and Google Search Console. I’m going to show you how to configure Screaming Frog to find orphan pages from these three sources. I recommend setting up all three to have the best chance of finding orphan pages. However, you can still run a crawl analysis even if you only opt to set up one or two of the sources. (Note: you’ll need a Screaming Frog license to do this.)
1 – Link XML Sitemaps
To add your sitemap, head to ‘Configuration’ in the top menu and select ‘Spider’ from the drop-down.
A settings box will open on the ‘Crawl’ tab. Scroll down and check the boxes next to ‘Crawl Linked XML Sitemaps’ and ‘Crawl These Sitemaps.’ Paste the URL(s) of your sitemap(s) in the text box and click ‘OK.’
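If you’d like to check which URLs a sitemap actually lists before you run the crawl, a short Python sketch like the one below can pull them out. It’s a rough illustration, not how Screaming Frog works internally. The sample XML and the sitemap_urls helper are hypothetical, and it assumes a standard urlset sitemap rather than a sitemap index.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sitemap_urls(xml_text):
    """Return the <loc> values from a urlset-style XML sitemap."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]

# Hypothetical sitemap content for illustration.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>https://www.example.com/old-promo</loc></url>
</urlset>"""

urls = sitemap_urls(sample)
print(urls)  # ['https://www.example.com/', 'https://www.example.com/old-promo']
```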
2 – Connect to Google Analytics
Head to ‘Configuration’ → ‘API Access’ to connect to the Google Analytics API.
Click ‘Connect to New Account’ if you have not previously connected your Google Analytics account.
This will take you to a Google login page. Select the user account associated with your Google Analytics account to give Screaming Frog access to your GA data. You can easily remove this connection after the crawl analysis is complete.
The segment filter allows you to select the source of traffic you want to analyze. If you want to find orphan pages that show up during organic Google searches, select ‘Organic Traffic’ from under the segment section. If you want to detect orphan pages that may show up during paid searches, select ‘Paid Traffic,’ and if you want to see all traffic sources, select ‘All Users.’ Selecting ‘All Users’ will cast the widest net, so it’s generally the way to go if you’re trying to find orphan pages.
You can also set a date parameter from the ‘Date Range’ tab in the top menu. The default date range is 30 days, but you can increase this if you’d like. Setting a range of 12 months can help you find pages that are seasonal.
To add the new URLs discovered in Google Analytics to the crawl queue and make them viewable within the user interface, you need to check the ‘Crawl New URLs Discovered In Google Analytics’ option under the ‘General’ tab.
Once you’ve configured everything, click ‘OK.’
3 – Connect to Google Search Console
Connecting to the Google Search Console will help you identify pages that are getting impressions in searches but don’t have internal links. To pull data from Google Search Console for your crawl, you’ll need to connect to the Search Analytics API like you did with Google Analytics.
Once it’s connected, select the appropriate property.
Switch to the ‘Search Analytics’ tab. Check off ‘Crawl New URLs Discovered In Google Search Console.’ You can change the date range if you wish, and then click ‘OK.’
4 – Crawl Your Site With Screaming Frog
Great! Everything is hooked up, so now you can crawl the site. Type the URL you want to crawl in the box at the top, then click ‘Start.’
5 – Configure & Run Crawl Analysis
Once the crawl has finished, configure the Crawl Analysis by clicking on ‘Crawl Analysis’ in the top menu and then ‘Configure.’
Since I’m only interested in Orphan URLs right now, I’ve unchecked certain options. However, you can check every box if you want more data. Just make sure the boxes for Sitemaps, Analytics, and Search Console are checked. If you didn’t connect one of those, leave that box unchecked. Click ‘OK.’
Head back to the Crawl Analysis menu and click ‘Start!’
6 – Find Orphan Pages
When crawl analysis is complete, toggle to either the Sitemaps, Analytics, or Search Console tab at the top.
If you don’t see them beneath where you entered your URL, click on the down arrow next to the tabs and locate them in the drop-down menu. Make sure ‘Sitemaps,’ ‘Analytics,’ and ‘Search Console’ are checked. You can also uncheck tabs you don’t want to see.
Navigate to the Sitemaps tab. Now you can filter your crawl results to identify orphan pages. Click on the funnel beneath the crawl analysis tabs and select ‘Orphan URLs.’
You’ll need to filter the Sitemaps, Analytics, and Search Console tabs the same way, so each one shows only the URLs discovered from that source that have no internal links pointing to them.
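Conceptually, each ‘Orphan URLs’ filter amounts to a comparison: URLs a source knows about, minus URLs the crawler reached by following internal links. The Python sketch below shows that comparison on hypothetical data. The orphan_candidates helper and the URLs are invented for illustration, and this is not a description of Screaming Frog’s actual implementation.

```python
def orphan_candidates(crawled, sources):
    """For each source, list URLs it reports that the link-following crawl never reached."""
    crawled_set = set(crawled)
    return {
        name: sorted(set(urls) - crawled_set)
        for name, urls in sources.items()
    }

# Hypothetical data: URLs reached via internal links vs. URLs known to each source.
crawled = ["https://www.example.com/", "https://www.example.com/blog"]
sources = {
    "sitemap": ["https://www.example.com/", "https://www.example.com/old-promo"],
    "analytics": [
        "https://www.example.com/blog",
        "https://www.example.com/old-promo",
        "https://www.example.com/landing",
    ],
    "search_console": ["https://www.example.com/"],
}

result = orphan_candidates(crawled, sources)
for name, urls in result.items():
    print(name, urls)
```

A URL that appears under more than one source, like the old promo page here, is a strong sign that the page is still live and still getting attention despite having no internal links.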
If any URLs show up as orphaned, check their status codes to determine whether they’re live pages. You can filter or sort by the Status Code column to handle live pages separately from redirects and error pages. If you’ve already redirected or removed a page, you should be fine. Just be sure to remove it from your sitemap as well.
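If you’re working through a long list, it can help to bucket the orphaned URLs by status code first. Here is a small Python sketch of that triage, assuming you already have each URL paired with its status code. The group_by_status helper and the sample rows are hypothetical.

```python
def group_by_status(rows):
    """Split (url, status_code) pairs into live pages, redirects, and errors."""
    groups = {"live": [], "redirect": [], "error": [], "other": []}
    for url, status in rows:
        if 200 <= status < 300:
            groups["live"].append(url)
        elif 300 <= status < 400:
            groups["redirect"].append(url)
        elif 400 <= status < 600:
            groups["error"].append(url)
        else:
            groups["other"].append(url)
    return groups

# Hypothetical orphan URLs with their status codes.
rows = [("/a", 200), ("/b", 301), ("/c", 404), ("/d", 500)]
groups = group_by_status(rows)
print(groups)
# {'live': ['/a'], 'redirect': ['/b'], 'error': ['/c', '/d'], 'other': []}
```

Live pages are the ones that need a decision about linking, while redirects and errors mostly need to be removed from the sitemap.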
When I ran an analysis on our site, I only found one potential orphan page. Upon further examination, I saw the page had already been redirected.
If you need to save the discovered URLs, click on ‘Export.’
This will only export the data from that one tab. If different URLs appear under your tabs, you must export them separately.
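Since each tab exports separately, you may want to combine the files into one de-duplicated list afterward. The Python sketch below does that with the standard csv module. It assumes the exports are CSV files with the URL in a column named "Address", so check the header row of your own files and adjust url_column if it differs. The merge_exports helper and the sample data are hypothetical.

```python
import csv
import io

def merge_exports(files, url_column="Address"):
    """Combine URL columns from several CSV exports, keeping first-seen order."""
    seen = set()
    merged = []
    for f in files:
        for row in csv.DictReader(f):
            url = (row.get(url_column) or "").strip()
            if url and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged

# Hypothetical exports; in practice, pass open file objects instead of StringIO.
sitemap_export = io.StringIO(
    "Address,Status Code\nhttps://www.example.com/old-promo,200\n"
)
analytics_export = io.StringIO(
    "Address,Status Code\n"
    "https://www.example.com/old-promo,200\n"
    "https://www.example.com/landing,301\n"
)

merged = merge_exports([sitemap_export, analytics_export])
print(merged)
# ['https://www.example.com/old-promo', 'https://www.example.com/landing']
```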
How To Fix Orphan Pages
While most orphan pages are the result of poor linking practices, some happen naturally and exist only for a short amount of time. Since these pages will resolve themselves eventually, there’s not much to do about them.
For example, pages that return a 400-level status code (such as a 404) may still exist on your site as URLs. If Google has already crawled and indexed them, they’re still technically orphan pages. If these orphan pages come from the Sitemaps tab, remove the URLs from the sitemap so Googlebot doesn’t waste time crawling error pages. If they come from the Analytics or Search Console tab, Google likely discovered them through a link that has since been removed or through an external link. Either way, Google will eventually learn to ignore the URL because of its error status.
Otherwise, here are a few things you can do to clean up orphan pages on your website.
1. Is It Necessary?
Coming across an orphan page is also an opportunity to consider the importance of a page. If it ended up orphaned, is it really a page you want to keep? If not, redirect or delete the page.
If you do want to keep the page and you don’t want Google to index it, refrain from linking to it internally and implement a noindex robots meta tag to let search engines know they shouldn’t index it.
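To confirm that a page you meant to keep out of the index really carries the directive, you can check its HTML for a robots meta tag whose content includes noindex. The Python sketch below uses the standard library’s HTML parser. The has_noindex helper and the sample markup are hypothetical, and the check only covers the meta tag, not an X-Robots-Tag HTTP header.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Records whether a <meta name="robots"> tag contains the noindex directive."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            directives = [d.strip().lower() for d in attr.get("content", "").split(",")]
            if "noindex" in directives:
                self.noindex = True

def has_noindex(html):
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex

# Hypothetical page markup for illustration.
page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex(page))  # True
```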
For pages that you do want to index, follow the next three steps.
2. Add It to Your Sitemap
If you discovered the orphan page through the Analytics or Search Console tab and it is not in the sitemap, add it. It will still technically be an orphan page until you add internal links to it, but the sitemap will help Googlebot crawl the page.
If you have a dynamically generated sitemap, then any indexable page you create will automatically be added to it. Otherwise, you need to add each page you want indexed to your sitemap. Make this a regular practice to avoid inadvertent orphan pages.
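If you maintain your sitemap by hand, a small script can regenerate it from your list of indexable URLs so that nothing gets left out. The Python sketch below builds a minimal urlset document with only the required loc entries. The build_sitemap helper and the example URLs are hypothetical, and a real sitemap may also need an XML declaration or optional fields such as lastmod.

```python
import xml.etree.ElementTree as ET

def build_sitemap(urls):
    """Return a minimal XML sitemap (urlset with one <loc> per URL) as a string."""
    urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")

# Hypothetical list of pages you want indexed.
pages = ["https://www.example.com/", "https://www.example.com/new-page"]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```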
3. Link to It
The obvious fix for an orphan page is an inbound link. Creating an internal link to the page from a relevant page on your website will allow it to be found by both search crawlers and site visitors.
Any orphan page you deem valuable should be integrated into your internal linking strategy. Have a good look around your site and find at least one appropriate page to place the link on, ideally with descriptive anchor text.
4. Submit It to Google Search Console
If you want the orphan page to be indexed, you should also take the time to submit it to Google Search Console’s URL Inspection Tool for indexing after you’ve linked it internally.
In addition to providing you with valuable information about the current index status of your pages, the URL Inspection Tool lets you directly request that Google crawl a specific URL. Submitting a request asks Google to crawl the page without waiting for Googlebot to find the new internal link on its own, though it doesn’t guarantee that the page will be indexed or ranked.
Prevent Orphan Pages With an Internal Linking Strategy
The best way to prevent orphan pages is to maintain a clear site structure and follow a solid internal linking strategy. A good site structure divides content up into different sections and categories with links directing user traffic deeper into each section. Keep your site structure in mind when expanding your website to thoughtfully create linking opportunities.
Tackle SEO Concerns With a Qualified Partner
There’s no need to struggle with SEO problems like orphan pages — tackle them head-on with the help of a qualified partner. Victorious’ SEO site audit services will help you uncover a variety of on-page and off-page issues that may impact your ranking potential. If you’re looking to take your site’s SEO to the next level, schedule a free SEO consultation to get started.