SEO: Blocking certain words from being crawled

Scivillage.com Casual Discussion Science Forum (https://www.scivillage.com)
Forum: Computer Sci., Programming & Intelligence (https://www.scivillage.com/forum-79.html)
Thread: SEO: Blocking certain words from being crawled (/thread-2580.html)
SEO: Blocking certain words from being crawled - stryder - Jun 30, 2016

A lot of the repeated text on a site is junk data that shouldn't really be used to measure your keywords. However, some search engines aren't particularly intelligent at stripping out such information, so you end up with your main keywords being something like "login", "forum" or "discussion".

Not much has been said or done about how to deal with this problem. It's not like the HTML consortium has added a "nose" (no search engines) attribute to every tag to let you toggle which elements are spiderable (and even if it had, it would increase a webpage's markup considerably).

So that leaves it down to what search engines provide in the form of webmaster tools. Some search engine companies let you log in to such tools and remove common words from being weighed. Other companies went with adding their own exclusion method; in Yahoo's case that's a "robots-nocontent" class that tells the search engine what content to skip. (This, however, can again lead to some seriously inflated markup.)

Method using CSS ::before

That's why I've devised a method that I think works (it still needs testing by the world at large to see how well it works). It considers certain points:
Firstly a CSS style is applied:

Code:
.x:before{

This CSS says that an element with the class x should render the content of its title attribute before the element's own content. Its application in HTML would look something like this:

Code:
<ul>

This causes the words Homepage and Contact to be shown as links, even though the span tag itself has no content. Homepage uses a standard open-and-close tag, whereas Contact uses a self-closing tag. Apparently the self-closing form is "Invalid" according to one validation system; however, it still works.

I'd suggest not using this class directly on <a> (anchor) tags, as anchors can be used for sitemaps and might be used to nest images to create image links.

While search engines can mine title attributes for important information, I would hazard a guess that if a tag has no content then its title is likely skipped. (I cannot, of course, confirm this.)

I would suggest that any search engine developers who spot this post/thread consider making their search engine behave as described above. It would make sense to have some form of easily applied standard like this that doesn't get mistreated as "an attempt to inflate keywords through hidden variables."

Hopefully this will prove over time to be the most effective way of blocking particular words from crawlers. Please feel free to add any comments, suggestions or additions.

If you find this helpful, please feel free to syndicate this page or spread a mention of www.scivillage.com

RE: SEO: Blocking certain words from being crawled - stryder - Jun 30, 2016

Image of results using veryseotools.com (image not preserved)

RE: SEO: Blocking certain words from being crawled - stryder - Jun 30, 2016

A few hours on and I'm still busy with this. Here's another way of doing it if you're doing it from scratch. Rather than taking already built tags like <span> or <strong> and adding a class and title to them, it's possible to actually create your own prototype tag.
Code:
<html>

The great thing about it being a prototype is that people can't complain if a validator throws out an error. After all, it's my prototype, so their validator's wrong.

Using a prototype method cuts back on the amount of markup that is otherwise needed to keep writing class and title everywhere. To be honest, you probably don't even need to use the attr CSS trick with a prototype, as spiders might well ignore its content anyway since they won't recognise the tag.
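
For anyone who wants to poke at the idea, here is a minimal sketch in Python of what a very naive crawler would "see" on a page built with the class-and-title method. Everything in it is an assumption for illustration: the sample markup, the `content:attr(title)` rule (reconstructed from the description of the CSS, not copied from the original code), and the `TextNodeExtractor` and `visible_words` names. It only models a spider that reads text nodes and ignores attributes, which is the guess made in this thread, not confirmed behaviour of any real search engine.

```python
from html.parser import HTMLParser

# Hypothetical page following the thread's description: the <span> tags are
# empty and their visible label is drawn from the title attribute by CSS.
PAGE = """
<style>.x:before{content:attr(title);}</style>
<ul>
  <li><a href="/"><span class="x" title="Homepage"></span></a></li>
  <li><a href="/contact"><span class="x" title="Contact"></span></a></li>
</ul>
<p>Article text about search engine optimisation.</p>
"""


class TextNodeExtractor(HTMLParser):
    """Collects words from text nodes only, skipping <style>/<script> bodies.

    Attributes such as title are never inspected, which is the assumed
    behaviour of a simple spider.
    """

    def __init__(self):
        super().__init__()
        self.words = []
        self._skip_depth = 0  # > 0 while inside <style> or <script>

    def handle_starttag(self, tag, attrs):
        if tag in ("style", "script"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("style", "script") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.words.extend(data.split())


def visible_words(html):
    """Return the words a text-node-only spider would extract from html."""
    parser = TextNodeExtractor()
    parser.feed(html)
    parser.close()
    return parser.words


if __name__ == "__main__":
    print(visible_words(PAGE))
```

With this toy extractor, the menu labels never reach the word list because they exist only as attribute values, while the paragraph text does. A spider that also indexes title attributes, or one that renders CSS before extracting text, would behave differently, so this is a way to reason about the method rather than proof that it works.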