Prevent Laravel Routes from Being Indexed on Google

AUGUST 19, 2018

There is a nifty way to specify how you want each of the pages (or Laravel routes) of your site to be indexed by search engines. In my case, I looked at the Robots meta tag and X-Robots-Tag HTTP header specifications to learn more about what was possible.

In short, you might tell Google a specific route or page has "no restrictions for indexing or serving" by setting the X-Robots-Tag HTTP header to all or, on the contrary, tell it to stop indexing the page (and serving cached versions of it) with the noindex value.
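
To get a feel for what this header does before reaching for a package, here's a minimal sketch that sets it by hand on a single route; the /secret route and its closure are just illustrative.

// routes/web.php

Route::get('/secret', function () {
    // Ask search engines not to index this page.
    return response('Nothing to see here.')
        ->header('X-Robots-Tag', 'noindex');
});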

In Laravel, the folks at Spatie have made this really easy. Just install their spatie/laravel-robots-middleware Composer package in your Laravel app with:

composer require spatie/laravel-robots-middleware

Let's see a few examples of how to use this.

Allow every single page to be indexed and served

Create a new middleware in your application.

// app/Http/Middleware/MyRobotsMiddleware.php

<?php

namespace App\Http\Middleware;

use Illuminate\Http\Request;
use Spatie\RobotsMiddleware\RobotsMiddleware;

class MyRobotsMiddleware extends RobotsMiddleware
{
    /**
     * @return string|bool
     */
    protected function shouldIndex(Request $request)
    {
        // Send X-Robots-Tag: all (no restrictions) on every response.
        return 'all';
    }
}

And then register your new middleware in the global middleware stack.

// app/Http/Kernel.php

class Kernel extends HttpKernel
{
    protected $middleware = [
        // ...
        \App\Http\Middleware\MyRobotsMiddleware::class,
    ];
    
    // ...
}
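
If you want to check that the header is actually being sent, a quick feature test can assert it. This sketch assumes your app serves a / route; the test class and method names are hypothetical.

// tests/Feature/RobotsHeaderTest.php

<?php

namespace Tests\Feature;

use Tests\TestCase;

class RobotsHeaderTest extends TestCase
{
    public function test_home_page_sends_robots_header()
    {
        // The middleware should attach X-Robots-Tag: all to this response.
        $this->get('/')->assertHeader('X-Robots-Tag', 'all');
    }
}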

Forbid every single page from being indexed, cached, and served

// app/Http/Middleware/BlockAllRobotsMiddleware.php

<?php

namespace App\Http\Middleware;

use Illuminate\Http\Request;
use Spatie\RobotsMiddleware\RobotsMiddleware;

class BlockAllRobotsMiddleware extends RobotsMiddleware
{
    /**
     * @return string|bool
     */
    protected function shouldIndex(Request $request)
    {
        // Send X-Robots-Tag: noindex to keep every page out of search results.
        return 'noindex';
    }
}

Conditional robots middleware

Probably the most interesting application of this middleware is embedding smarter logic to avoid indexing specific pages while still letting Google (and other search engines) crawl the pages you do want to expose in search results.

We could send a noindex header for our admin pages only, for instance.

// app/Http/Middleware/SelectiveRobotsMiddleware.php

<?php

namespace App\Http\Middleware;

use Illuminate\Http\Request;
use Spatie\RobotsMiddleware\RobotsMiddleware;

class SelectiveRobotsMiddleware extends RobotsMiddleware
{
    protected function shouldIndex(Request $request): string
    {
        // Block anything under /admin from being indexed.
        if ($request->segment(1) === 'admin') {
            return 'noindex';
        }

        return 'all';
    }
}
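
As a variation, if you'd rather match path patterns than URL segments, Laravel's $request->is() helper expresses the same idea. In this sketch, the admin/* and drafts/* patterns are just placeholders for whatever you want to hide.

// app/Http/Middleware/SelectiveRobotsMiddleware.php (pattern-based variant)

<?php

namespace App\Http\Middleware;

use Illuminate\Http\Request;
use Spatie\RobotsMiddleware\RobotsMiddleware;

class SelectiveRobotsMiddleware extends RobotsMiddleware
{
    protected function shouldIndex(Request $request): string
    {
        // Hide any page whose path matches one of these patterns.
        if ($request->is('admin/*') || $request->is('drafts/*')) {
            return 'noindex';
        }

        return 'all';
    }
}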

Remember that you need to add all of your new middlewares to the app/Http/Kernel.php file in order for them to run on each request. This method can be handy to block search indexing with noindex or to customize the way search engines are allowed to process your pages. Here are other directives you can use in the X-Robots-Tag HTTP header and what they mean.

  • all - There are no restrictions for indexing or serving. Note: this directive is the default value and has no effect if explicitly listed.
  • noindex - Do not show this page in search results and do not show a "Cached" link in search results.
  • nofollow - Do not follow the links on this page.
  • none - Equivalent to noindex, nofollow.
  • noarchive - Do not show a "Cached" link in search results.
  • nosnippet - Do not show a text snippet or video preview in the search results for this page. A static thumbnail (if available) will still be visible.
  • notranslate - Do not offer translation of this page in search results.
  • noimageindex - Do not index images on this page.
  • unavailable_after: [RFC-850 date/time] - Do not show this page in search results after the specified date/time. The date/time must be specified in the RFC 850 format.
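
Because the string you return from shouldIndex() ends up as the header's value, you should be able to combine several of these directives by returning them comma-separated. Here's a sketch of that idea, dropped into the selective middleware shown above.

// Inside shouldIndex(): combine directives in one comma-separated string.
if ($request->segment(1) === 'admin') {
    // Keep admin pages out of search results and don't follow their links.
    return 'noindex, nofollow';
}

return 'all';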

Thanks!

I hope you found this useful. Feel free to ping me at @nonoesp or join the mailing list. Here are some other Laravel posts and code-related posts.