Site review for a family social network
SEO Research and Opportunities Report
Last updated
Arguably, the site's largest problem is content similarity that is treated as duplication. The site's architecture means the substance of your content is the same across different pages: consumer news categories, consumer news articles, website categories, etc. This results in inadequate indexing of your sitemap / site's pages and a poorer user experience, which Google can use as a predictor of your site's ranking performance.
So, the upstream fix for this problem is a redesign of the landing pages. If that is not possible, you want to use the rel="canonical" element to tell Google which page is the preferred one.
Another strategic problem is that the site is multi-language but does not annotate its language / regional versions. Google determines a page's language from the page content itself and normally expects a single language to be used across a page / site. Google advises that the typical approach is one country-code top-level domain (ccTLD) per targeted language / location. So targeting a sub-domain to a specific audience by language and location is not as straightforward as, for example, targeting a ccTLD. Importantly, the GWT country setting won't help much here: it is used when a user elects to see results from a specific country only.
The strategic decision is up to you: consider switching each language version to a separate ccTLD. That will yield better ranking results for local searches in different languages. If that is not possible, you have to mark up the language / regional versions of your site with the hreflang attribute. This way Google understands that the site is multi-language, not just a site that happens to have content in different languages.
Another very important issue is internal links. A considerable part of the internal links are dynamic, meaning they require user input to produce content. Under this scenario, the overall crawlability of the site is likely to be lower than with static (HTML) links. So my first advice is to consider switching to static links altogether. If that is not possible: 1) use absolute URLs when linking internally from page to page, 2) specify all dynamic URLs in your sitemap, 3) use mod_rewrite to rewrite pages with more than one parameter in the URL.
User experience with your content is covered separately in the Content Strategy report.
"Developers keep SEOs in business" (Jill Whalen)
There are a lot of things you can fix upstream so that you don't need to fix them downstream later on.
As you can see from the Index Time Series in GWT, around 100 pages are currently indexed (please see the Index Status snapshot below). That is roughly the same number you get from Google search when searching for: https://www.google.com/?gws_rd=ssl#q=site:gb.webfamiliz.com.
Your sitemap contains 2,300 pages, while only 83 are indexed. This proportion is stable, which points to a persistent problem with content indexing.
The indexing level of "char pages" is very low: only 7 pages are indexed.
Generally, you would greatly improve your indexing and ranking positions by providing meaningful titles for dynamic URLs such as, for example, http://gb.webfamiliz.com/websites/66/health-and-personal-care.html?char=C (see below how they currently look on Google). For more detail, see the Page titles section below.
Apparently, the site has two main content themes: Category (consumer news) and Websites (recommended sites). Inherently, their structure is a major source of duplication and crawler confusion.
Let's examine Consumer News > Automotive http://gb.webfamiliz.com/category/46/automotive.html. The category features a Top Article and several previews of other suggested articles (see below).
The More Articles link leads to a dynamic URL, which is not indexed unless directly included in the sitemap. Note that no page with the ?all parameter has been indexed so far: https://www.google.com/?gws_rd=ssl#q=site:gb.webfamiliz.com+More+articles
It means two things:
Low crawlability of the site: most of the articles are hidden from the crawler
An otherwise potentially good landing page is spoiled: if hidden from the crawler and lacking internal links, a page, though adequate in content quality, has little chance of ranking with Google.
Interestingly enough, I have found that the Top Article content apparently may change as a user browses the site and then revisits the page (see above a snapshot of the same category).
If so, I personally can't see a practical purpose for it.
When a user clicks to read on (the Learn more link), he/she lands on http://gb.webfamiliz.com/articles/46/1938/top-tyre-tips-what-to-look-out-for.html, a page with the article's full body and a Partner sites widget (see below).
Importantly, as a user reloads the page, a new array of partner listings appears (see below). A brief analysis of the links shows that they are largely irrelevant to the theme of the article itself.
Generally, Google will most likely treat the Partner sites widget as third-party content, which adds no value or is even detrimental to your content quality. For example, Google considers it a poor experience when users see a page with only a small piece of original content while all the rest, text and links, is irrelevant to the topic of the page.
One way to change this is to feature only the most relevant sites. If you think you should also feature sites that are not 100% relevant, you may want to make these second-order-relevance sites look different, for example by applying different styles, to deliver a better user experience.
Now, let's go the same way with Recommended sites > Automotive http://gb.webfamiliz.com/websites/46/automotive.html
The category features a Top Article as well as [a variation of] a Partner sites widget (see below).
So, from Google's perspective, Consumer News > Automotive http://gb.webfamiliz.com/category/46/automotive.html and Recommended websites > Automotive http://gb.webfamiliz.com/websites/46/automotive.html have duplicate content, since the two pages share the same substance. Seemingly, one does not add much value to the user experience compared with the other.
The real content on both pages is the same, and it links to the same page, which lives in the Articles category.
So, the site's architecture points to the real content that should be indexed in the first place. Most probably, Google will prefer crawling and indexing the Articles alone and will consider the other pages, including
articles category [landing] pages, for example Automotive
recommended websites category [landing] pages, as well as
dynamic URL pages
as duplicates.
Predictably enough, support for this hypothesis is right out there: https://www.google.com/?gws_rd=ssl#q=site:gb.webfamiliz.com+articles
In contrast to the results the link above demonstrates, there is only 1 result representing websites category pages in the Google index: http://gb.webfamiliz.com/websites/53/appliances.html
Yet there are 4 pages with dynamic URLs representing websites category parameters in the Google index:
Source: https://www.google.com/?gws_rd=ssl#q=site:gb.webfamiliz.com+websites
Website URLs with parameters (parameters start with a question mark (?)) are called dynamic URLs. A dynamic URL points to a document that requires the web server to do some computation before returning it. Most often, a dynamic URL is the result of a user session, because the user passes a command to the web server.
Googlebot is not a browser: if you don't specify a dynamic URL in your sitemap or through an internal link, Googlebot won't index it, as it won't trigger the action a user would.
According to your sitemap http://gb.webfamiliz.com/sitemap.xml, there is no declaration of pages with dynamic parameters.
Internal links look as follows: <a href="?char=A#list" class=" ">A
Source: view-source:http://gb.webfamiliz.com/websites/49/professional-equipment-and-supply.html?char=B
Therefore, URLs like
are a little obscure to Googlebot and will only be indexed as a result of spontaneous action.
I recommend that you specifically mention all URLs that carry meaningful content in your sitemap, including those in the
http://example.com/websites/.../name.html?char=* format, as well as
the ?char=*#list, ?page=*, and ?fw=true formats.
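As a hedged sketch of that recommendation (the exact URLs and sitemap layout are illustrative assumptions, not taken from your current sitemap), declaring a dynamic URL explicitly might look like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- static category page -->
  <url>
    <loc>http://gb.webfamiliz.com/websites/49/professional-equipment-and-supply.html</loc>
  </url>
  <!-- dynamic (parameterised) view, declared explicitly as an absolute URL;
       note that "&" in multi-parameter URLs must be escaped as "&amp;" in XML -->
  <url>
    <loc>http://gb.webfamiliz.com/websites/49/professional-equipment-and-supply.html?char=B</loc>
  </url>
</urlset>
```

Every parameterised view you want indexed would get its own <url> entry in this fashion.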
Also, I recommend using correctly formatted internal links to prevent Googlebot from getting confused by pages like, for example, http://gb.webfamiliz.com/websites/49/professional-equipment-and-supply.html?char=B
This means using an absolute URL in a link tag with a proper link referral location (see below).
Source: http://moz.com/learn/seo/internal-link
As noted above, you currently use relative URLs, for example <a href="?char=A#list" class=" ">A
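To illustrate the fix, here is a minimal before/after sketch (the target URL is taken from the example above; the class attribute is dropped for brevity):

```html
<!-- current: relative, parameter-only link, ambiguous for the crawler -->
<a href="?char=A#list">A</a>

<!-- suggested: the same link as an absolute URL -->
<a href="http://gb.webfamiliz.com/websites/49/professional-equipment-and-supply.html?char=A#list">A</a>
```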
So, my recommendations on the Indexing and Site Architecture part are the following.
Fix things upstream, not downstream
1. Change the URL structure to the simplest and clearest one possible.
A clear URL structure will give Googlebot a better understanding of your site's structure. The earlier you do it, the better. Besides, it is really easy to specify changes to the URL structure in WordPress. Once done, you would want to put a 301 (permanent) redirect in your .htaccess file from the old URL path to the newly created one.
In this way, for example, Consumer news would feature as the URL's category segment, followed by the name of the sub-category, for example, Grocery & Gourmet Food.
So, the URL for Grocery article listings would be http://gb.webfamiliz.com/consumer-news/grocery-gourmet-food/ instead of http://gb.webfamiliz.com/category/45/grocery-gourmet-food.html.
The individual article URL may look as follows: http://gb.webfamiliz.com/consumer-news/grocery-gourmet-food/where-to-buy-organic-food-online.html instead of http://gb.webfamiliz.com/articles/45/1900/where-to-buy-organic-food-online.html
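As a sketch only (it assumes an Apache server with mod_rewrite enabled and uses the example URLs above; your rewrite base and file layout may differ), the .htaccess redirects could look like:

```apache
# Hypothetical .htaccess sketch: 301-redirect old paths to the new structure
RewriteEngine On
# old category URL -> new consumer-news URL (301 = permanent)
RewriteRule ^category/45/grocery-gourmet-food\.html$ /consumer-news/grocery-gourmet-food/ [R=301,L]
# old article URL -> new article URL under the category path
RewriteRule ^articles/45/1900/(.+)\.html$ /consumer-news/grocery-gourmet-food/$1.html [R=301,L]
```

In practice you would generate one such rule (or a generic mapping) per category rather than hand-writing each redirect.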
Among other things, it will help you create a rich snippet, which hasn't yet been successfully implemented on the site; see more details in the Breadcrumbs section below.
Websites entries may have the following URL http://gb.webfamiliz.com/brands/hungry-house.html
Also, you would want to use mod_rewrite to drop the .php at the end of the URL.
2. Use links to static URLs instead of links to dynamic URLs where possible. As discussed, dynamic URLs require user input, so they can't be accessed by Googlebot. Also, dynamic URLs with several parameters look to Google like a high risk of duplicate content. Here is Google's official position:
"Your pages are dynamically generated. We're able to index dynamically generated pages. However, because our web crawler could overwhelm and crash sites that serve dynamic content, we limit the number of dynamic pages we index. In addition, our crawlers may suspect that a URL with many dynamic parameters might be the same page as another URL with different parameters. For that reason, we recommend using fewer parameters if possible. Typically, URLs with 1-2 parameters are more easily crawlable than those with many parameters."
So, on the strategy side, you would want to replace links to dynamic URLs with easily crawlable static links on the page. If you can't do that right now, specify the dynamic URLs in absolute URL format in your sitemap. Also, use mod_rewrite to rewrite the URLs of paths with two parameters, for example, http://fr.webfamiliz.com/websites/28/sports-loisirs-et-arts-crA-atifs.html?page=2&char=S. Importantly, the rewritten URL has to feature a good title, which is currently missing (see below).
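A hedged sketch of such a rewrite follows. The clean path scheme (/char-S/page-2/) is my invention for illustration, not an existing convention on the site, and it assumes Apache with mod_rewrite:

```apache
RewriteEngine On
# Internally map a clean, crawlable path to the underlying two-parameter dynamic URL
# e.g. /websites/28/sports/char-S/page-2/ -> /websites/28/sports.html?page=2&char=S
RewriteRule ^websites/([0-9]+)/([^/]+)/char-([A-Za-z])/page-([0-9]+)/?$ /websites/$1/$2.html?page=$4&char=$3 [L]
```

The site's internal links would then point at the clean paths, while the application continues to receive its familiar parameters.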
So, I recommend using titles relevant to the content, for example, Best Sport Sites starting with S (page 1)
3. Use canonical URLs to consolidate link signals for similar content. In practice, Google has already picked the most original, highest-quality content from your site, which leaves many pages unindexed because they apparently look like similar / duplicate pages to Google. So I suggest that you designate a canonical version of each landing page (the article page) and mark the other pages where this article features, for example the Consumer News and Websites category pages, with the rel="canonical" link element.
Example:
Add a <link> element with the attribute rel="canonical" to the <head> section of these pages: <link rel="canonical" href="http://blog.example.com/dresses/green-dresses-are-awesome"/>
This indicates the preferred URL to use to access the green dress post, so that the search results will be more likely to show users that URL structure.
Source: https://support.google.com/webmasters/answer/139066?hl=en
4. Use correctly formatted internal links. This means specifying an absolute URL instead of a relative URL as the internal link.
5. Use a 301 redirect to normalize the look of your pages' URLs. For example, you may redirect http://gb.webfamiliz.com/index.php to http://gb.webfamiliz.com/
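A minimal sketch of that redirect, assuming Apache with mod_rewrite (the condition guards against a loop if / is itself served by index.php internally):

```apache
RewriteEngine On
# 301-redirect only direct browser requests for /index.php to the root URL
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /index\.php[\s?]
RewriteRule ^index\.php$ / [R=301,L]
```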
Google uses international targeting to increase the relevance of search results for multinational and/or multiregional websites. To find the content's targeted country, Google uses a page's language and local information (contact details, maps, etc.).
Notably, country settings, for example International Targeting > Country in Google Webmaster Tools (GWT), are used by Google only to filter search results down to a specific country (source: https://support.google.com/webmasters/answer/62399?hl=en) and will not in any way substitute for a language declaration.
Generally, Google determines a page's language using only the visible content. It doesn't use any code-level language information such as lang attributes. HTML declarations such as <meta name='language' content='fr'> are ignored by Google.
To give Google a better understanding of your site's content, I advise marking up the regional versions of your site with hreflang tags.
In the first place, you would want to move your Regional Links from the footer (see below) to the header to indicate that your site has regional versions.
See, for example, how CNN.com does it (International | Mexico | Arabic).
Obviously, it is important to declare to Google right away that you are a multiregional site. You can keep the regional links in the footer as well.
Secondly, you would want to declare the regional versions of the main page and, additionally, of any other pages that have language versions.
To do that, you have to establish language values. For example,
de: German content, independent of region
en-GB: English content, for GB users
I suggest using the most precise declaration to increase probability of ranking in countries you target:
en-GB: English content, for GB users
fr-FR: French content, for France users
de-DE: German content, for Germany users
it-IT: Italian content, for Italy users
After establishing the language and, optionally, regional pairs, you would want to declare those values in the header of your main page.
Supposing the default page is http://gb.webfamiliz.com/, the header will contain the following statements
<link href="http://fr.webfamiliz.com/" hreflang="fr-FR" rel="alternate" title="TITLE" type="text/html"/>
<link href=" http://de.webfamiliz.com/" hreflang="de-DE" rel="alternate" title="TITLE" type="text/html"/>
<link href=" http://it.webfamiliz.com/" hreflang="it-IT" rel="alternate" title="TITLE" type="text/html"/>
AND the English version must include a rel="alternate" hreflang link to itself in addition to the links to the French, German, and Italian versions:
<link href="http://gb.webfamiliz.com/" hreflang="en-GB" rel="alternate" title="TITLE" type="text/html"/>
"TITLE" β substitute with your siteβs version title
In the case of other regional pages, you would want to mark up the header accordingly. For example, if the default page is http://fr.webfamiliz.com/, the header will look as follows.
<link href="http://fr.webfamiliz.com/" hreflang="fr-FR" rel="alternate" title="TITLE" type="text/html"/>
<link href="http://gb.webfamiliz.com/" hreflang="en-GB" rel="alternate" title="TITLE" type="text/html"/>
<link href=" http://de.webfamiliz.com/" hreflang="de-DE" rel="alternate" title="TITLE" type="text/html"/>
<link href=" http://it.webfamiliz.com/" hreflang="it-IT" rel="alternate" title="TITLE" type="text/html"/>
"TITLE" β substitute with your siteβs version title
Importantly, you would want to annotate every language version of a page (if any) with hreflang. For example, if you have a French version of http://gb.webfamiliz.com/articles/65/2225/where-to-buy-a-domain-name.html, you would want to cross-annotate it with its counterpart via hreflang.
<link href="http://gb.webfamiliz.com/articles/65/2225/where-to-buy-a-domain-name.html" hreflang="en-GB" rel="alternate" title="TITLE" type="text/html"/>
<link href="here goes the URL" hreflang="fr-FR" rel="alternate" title="TITLE" type="text/html"/>
Hreflang will help the right country/language version of your cross-annotated pages appear in the corresponding versions of *google.*
Note: if page A links to page B, page B must link back to page A. If this is not the case for all pages that use hreflang annotations, those annotations may be ignored or not interpreted correctly.
Thirdly, you shall update the JavaScript links menu: the links' source code must also contain hreflang annotations. For example, for the French language version the JavaScript shall produce the following code <link href="http://fr.webfamiliz.com/" hreflang="fr">
Fourth, you may indicate alternate language pages in your sitemaps.
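For the sitemap route, the alternates are declared with xhtml:link elements inside each <url> entry. A sketch (assuming the four sub-domains discussed above; each alternate URL also needs its own <url> entry repeating the same set of links):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>http://gb.webfamiliz.com/</loc>
    <xhtml:link rel="alternate" hreflang="en-GB" href="http://gb.webfamiliz.com/"/>
    <xhtml:link rel="alternate" hreflang="fr-FR" href="http://fr.webfamiliz.com/"/>
    <xhtml:link rel="alternate" hreflang="de-DE" href="http://de.webfamiliz.com/"/>
    <xhtml:link rel="alternate" hreflang="it-IT" href="http://it.webfamiliz.com/"/>
  </url>
</urlset>
```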
More specific instructions: https://support.google.com/webmasters/answer/2620865?hl=en
Expected result: Google will have a clear understanding that your site is a multiregional one. Sub-domains are more likely to appear in local search results for the countries specified.
Google normally expects a site to use a single language for content and navigation on each page (and across the site). This is reasonably considered a good user experience and helps Google determine the language correctly.
Google also normally assumes that a single site will use a single language, and it targets content to a specific location based mostly on the country-code top-level domain (ccTLD). So it is advised to mark up (annotate) links to content in different languages to help Google determine the language correctly.
Also, I have found that multiple pages have content in the wrong language, for example, http://gb.webfamiliz.com/category/52/hifi-video-photos-photo-printout.html or http://gb.webfamiliz.com/websites/48/books-music-films-and-entertainment.html, despite the HTML lang attribute.
This is rather confusing to Googlebot. So I advise keeping a close eye on how well the content matches the language / regional version it appears under.
More details on page titles (for example, how to construct them) are given in the Content Strategy report. Here I will concentrate on two things: the necessity of having titles and the title structure. The general observation is that page titles are not completed as they should be.
A page title together with its description in the search results is called a snippet. Google wants a snippet to be as relevant to the page's content as possible. Google evaluates the keyword(s) in the context of the snippet to tell users whether a page is going to be useful.
<title>The Title of the Page</title> The contents of this tag are generally shown as the title in search results (and of course in the user's browser)
Titles are the most important element that a search engine indexes, so you want all your pages to have them.
According to https://www.google.co.uk/?gws_rd=ssl#q=site:gb.webfamiliz.com, you have not focused much on title completeness: the wording and structure of your titles.
On the structure side, you want consistent use of the semantic elements (site title, category / subcategory title, and article / post title) so that they find uniform application across all your pages.
The site title commonly features the brand name and the positioning, which go as subject and predicate, for example, Web Familiz - Private social network. Right now, you don't have an established page title at all: look at https://www.google.co.uk/?gws_rd=ssl#q=site:gb.webfamiliz.com and you will see no uniformity across page titles.
Category / Subcategory Titles generally only feature the name of the Category / Subcategory and site title, for example, Gifts & Flowers Tips | Web Familiz - Private social network
Article / post titles may be built with the following formula:
{Subject + Predicate} | {Site Title}
Brand names may use the following rule:
{Brand} - {Category} | {Site Title}
Importantly, a page's title must correspond to the page's <h1> header. See the example below: the page title is Information while the header is Media (the URL path is Medias).
So, I recommend setting up a formula / rule for page titles and applying it uniformly.
Webfamiliz.com is a magical website that will allow you to communicate with your family and people you feel close to.
The meta description is not set for the majority of pages, including the main page, so Google has auto-generated snippets for them. The most popular snippet is a cookie notice, which is not to the point (please see a snapshot below).
Where a meta description is set, it is mostly not to the point either. For Google, the meta description tag is a handy way to make a proper snippet for a site. If the meta description is really compelling, the person who sees it may click through more often. So, if you care about conversion, you probably need well-written meta descriptions: informative, precise, and intriguing. Currently, they do not look so. Please see the meta description of the "About us" page below
or of "Hints, Tips and Reviews" about Dating
Reference: https://www.youtube.com/watch?v=W4gr88oHb-k
Breadcrumbs are used to enhance the user experience of navigating your site and also give Google an additional understanding of how the parts of content on your site relate to each other. On your site, breadcrumbs are not working properly.
For example, you have marked up Main page > Consumer News [topic] > Pet supplies [sub-topic] but failed to mark up the individual article within the subcategory, i.e. Cat Food Buying Guide.
Let's see how this article looks below (ref: http://gb.webfamiliz.com/articles/68/2261/cat-food-buying-guide.html)
Obviously, the Breadcrumbs microdata cannot be found on the page, as it uses a different URL architecture than the pages that stand higher in the hierarchy. As advised above, I suggest changing the overall URL architecture and using it consistently.
That way you can just mark every link with <div itemscope itemtype="http://data-vocabulary.org/Breadcrumb"> and that's all: the breadcrumbs will appear OK. If changing the URL structure is not possible, you should use the child property to specify the next item in the hierarchy.
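A hedged sketch of a full breadcrumb trail using the data-vocabulary.org vocabulary and the child property (the intermediate category URL is illustrative, not the site's current one):

```html
<div itemscope itemtype="http://data-vocabulary.org/Breadcrumb">
  <a href="http://gb.webfamiliz.com/" itemprop="url">
    <span itemprop="title">Main page</span>
  </a> ›
  <div itemscope itemtype="http://data-vocabulary.org/Breadcrumb" itemprop="child">
    <a href="http://gb.webfamiliz.com/category/68/pets-supplies.html" itemprop="url">
      <span itemprop="title">Pet supplies</span>
    </a> ›
    <div itemscope itemtype="http://data-vocabulary.org/Breadcrumb" itemprop="child">
      <a href="http://gb.webfamiliz.com/articles/68/2261/cat-food-buying-guide.html" itemprop="url">
        <span itemprop="title">Cat Food Buying Guide</span>
      </a>
    </div>
  </div>
</div>
```

Each nested Breadcrumb item is tied to its parent via itemprop="child", so the article level is no longer left out of the trail.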
More details: https://support.google.com/webmasters/answer/185417?hl=en
The site uses frames that feature recommended websites content (see below).
Generally, frames can cause problems for search engines because they don't correspond to the conceptual model of the web. In this model, one page displays only one URL. Pages that use frames or iframes display several URLs (one for each frame) within a single page. Source: https://support.google.com/webmasters/answer/34445?hl=en
You have provided for excluding /redirect.php from crawling.
Still another persistent issue is that frames are bad for the user experience. They cause confusion when bookmarking and emailing links, break with iframe busters, etc. It is an inconsistent, wonky user experience.
So, my general advice is to make the frame experience more comfortable for users. You may want to provide a link / button to close the frame and proceed to the destination page's content. It would greatly enhance the user experience.
Importantly, I have found that the framed view sometimes disappears, arguably after a timeout, which produces "classic" Google Analytics custom campaign parameters in the URL (see below).
Source: https://support.google.com/analytics/answer/1033867?hl=en&ref_topic=1032998
If this observation is correct, the site's designers have been planning a gateway out of frames. So this can be a good starting point for implementing such a gateway and testing users' feedback.
Besides, as the link in red above shows, your partners have been using custom campaigns to track the performance of referral visitors from your site. This means you should be using Google Analytics too, at least to verify, for example, the accuracy of the e-commerce data and, probably, to upload it to Universal Analytics for marketing purposes. At this point, considerations of practicality would, in my opinion, outweigh those of privacy, etc.
The <div> HTML code on the main page is not technically valid, as the content does not render correctly (see below).
However, http://gb.webfamiliz.com/index.php does not have that problem.
As suggested above, I recommend putting 301 redirects on versions of the [main] page, including http://gb.webfamiliz.com/index.php and fr.webfamiliz.com/?fw=true. Obviously, you would also want to update the code on the main page to make it look like /index.php.
Sub-domain specific notices
Obviously, the registration form does not work properly right now. I could not sign up because of an error with Date of birth (see below).
You would want to disallow crawling of URLs that appear as a result of applying the Search on the internet filter. Please see an example of one such page below.
It has been stuck in the Google index with a rather weird name (see below).
To prevent this from happening again, use robots.txt to disallow paths like /search.php?* from crawling.
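A minimal robots.txt sketch for that rule (Google supports the "*" wildcard in Disallow patterns; whether you also want to block /search.php itself, without parameters, is a judgment call):

```text
# Block internal search result pages from crawling
User-agent: *
Disallow: /search.php?*
```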
As seen in the Google Webmaster Tools snapshot, there are more than 3,000 pages of fr.webfamiliz.com in the Google index.
Yet according to https://www.google.com/?gws_rd=ssl#q=site:fr.webfamiliz.com, only 64 pages actually show up in Google search.
My hypothesis is that Google is a little confused by your site's architecture, which produces lots of similar content, dynamic links that may look like duplicates, and language ambiguity (which language do you actually use?). As a result, it picks the URLs that are safest with regard to its guidelines and serves those in search. Once you follow the recommendations given here, Google will be much more comfortable with your site.
In summary, the site has: a larger problem with URL structure, content duplication, and dynamic page indexing.