Automatic sitemap.txt with routing

I’ve been doing some digging into SEO when using the beta routing dependency. I started thinking that it would be pretty easy to automatically generate a sitemap.txt file and serve it on demand.

After some more digging, I found routing.router._route.sorted_routes and parsed each route into a media object that is served as the response to the /sitemap.txt route. Looking at Google Search Console, it seems happy with the result.

I added an optional private attribute when defining a Route; any route that sets it gets excluded from the sitemap.

Anyways… Here we are in the server side route module:

server/ServerRoutes.py

from . import routes

import anvil.server

@anvil.server.route('/sitemap.txt')
def get_sitemap():
    # Imported here so these are only loaded when the sitemap is requested
    # (anvil.server is already imported at the top of the module)
    from routing.router._route import sorted_routes
    import anvil.media

    # Grab the app origin url
    origin = anvil.server.get_app_origin()

    # I don't really know if this bare url is necessary, but what's the harm?
    sitemap = [origin]

    # Go through each of the defined routes
    for route in sorted_routes:
        # We can mask routes when defining them by adding a private=True attribute
        if not getattr(route, "private", False):
            sitemap.append(f"{origin}{route.path}")

    # Compile the routes, one per line
    file_contents = "\n".join(sitemap).encode()
    file = anvil.BlobMedia(content_type="text/plain", content=file_contents, name="sitemap.txt")

    # serve up our up-to-date sitemap.
    return anvil.server.HttpResponse(200, file)

For this example, my client/routes.py file looks like:

from routing.router import Route

class IndexRoute(Route):
    path = "/"
    form = "Pages.Home"
    
class AboutRoute(Route):
    path = "/about"
    form = "Pages.About"
    
class FaqRoute(Route):
    path = "/faq"
    form = "Pages.FAQ"

class PrivateRoute(Route):
    path = "/private"
    form = "Pages.Private"
    private = True

Then at my_custom_domain.com/sitemap.txt you get something like:

https://my_custom_domain.com
https://my_custom_domain.com/
https://my_custom_domain.com/about
https://my_custom_domain.com/faq
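
Note that the bare origin and the "/" route both show up in that output, and they point at the same page. If that bothers you, here is a minimal sketch of a de-duplicating builder. It is plain Python with no Anvil imports; build_sitemap_lines and the stand-in route classes are hypothetical names for illustration, and it assumes each route exposes a path and an optional private attribute like the routes above.

```python
def build_sitemap_lines(origin, routes):
    """Return one URL per public route, skipping duplicates of the home page."""
    origin = origin.rstrip("/")
    lines = [origin + "/"]   # always list the home page once
    seen = {origin}          # URLs are compared without a trailing slash

    for route in routes:
        # Routes marked private=True are left out of the sitemap
        if getattr(route, "private", False):
            continue
        path = getattr(route, "path", None)
        if not path:
            continue
        url = origin + path
        key = url.rstrip("/")
        if key in seen:
            continue
        seen.add(key)
        lines.append(url)
    return lines


# Hypothetical stand-ins for the Route subclasses, for illustration only
class IndexRoute:
    path = "/"

class AboutRoute:
    path = "/about"

class FaqRoute:
    path = "/faq"

class PrivateRoute:
    path = "/private"
    private = True


demo_routes = [IndexRoute, AboutRoute, FaqRoute, PrivateRoute]
demo_lines = build_sitemap_lines("https://my_custom_domain.com", demo_routes)
print("\n".join(demo_lines))
```

In the server route you would pass anvil.server.get_app_origin() and sorted_routes into it, then join the result with newlines as before.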

I guess while I’m at it, I might as well add a robots.txt file that points to the sitemap.txt file. I tested it with Google Search Console, which reported the sitemap as found and valid.

from . import routes

import anvil.server


def create_text_file(lines: list, file_name: str):
    """ Create a text file from a list of strings """
    import anvil.media
    file_contents = "\n".join(lines).encode()
    return anvil.BlobMedia(content_type="text/plain", content=file_contents, name=file_name)


@anvil.server.route('/sitemap.txt')
def get_sitemap():
    """ Create and serve a simple sitemap.txt file based on the routes defined in client/routes.py
    More info on sitemaps:
    https://developers.google.com/search/docs/crawling-indexing/sitemaps/build-sitemap
    """ 
    # imported here so it is only loaded when the sitemap is requested
    from routing.router._route import sorted_routes
    
    # Grab the app origin url
    origin = anvil.server.get_app_origin()

    # I don't really know if this bare url is necessary, but what's the harm?
    sitemap = [origin]

    # Go through each of the defined routes
    for route in sorted_routes:
        # We can mask routes when defining them by adding a private=True attribute
        if not getattr(route, "private", False):
            sitemap.append(f"{origin}{route.path}")

    # serve our up-to-date sitemap
    file = create_text_file(sitemap, "sitemap.txt")
    return anvil.server.HttpResponse(200, file)


@anvil.server.route('/robots.txt')
def get_robots():
    """ Create and serve a robots.txt file
    More info:
    https://developers.google.com/search/docs/crawling-indexing/robots/intro
    """
    origin = anvil.server.get_app_origin()
    lines = [
        "# robots.txt",  # Just a dumb header for us dumb humans to read.
        f"Sitemap: {origin}/sitemap.txt"  # Provide the path to the sitemap for crawling
    ]

    # Create our txt file and return it in a http response
    file = create_text_file(lines, "robots.txt")
    return anvil.server.HttpResponse(200, file)
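
If you also want to ask crawlers to stay away from the private routes (not just leave them out of the sitemap), the robots.txt lines could be built roughly like this. This is only a sketch in plain Python: build_robots_lines and its private_paths parameter are hypothetical names, and collecting the private paths from sorted_routes is left to the caller. Keep in mind that Disallow is a request to well-behaved crawlers, not access control.

```python
def build_robots_lines(origin, private_paths=()):
    """Return robots.txt lines that point at the sitemap and
    optionally disallow crawling of the given paths."""
    origin = origin.rstrip("/")
    lines = ["User-agent: *"]  # the rules below apply to every crawler

    if private_paths:
        # One Disallow rule per private path
        lines.extend(f"Disallow: {path}" for path in private_paths)
    else:
        # Nothing to hide, so explicitly allow everything
        lines.append("Allow: /")

    lines.append("")  # blank line between the rule group and the sitemap
    lines.append(f"Sitemap: {origin}/sitemap.txt")
    return lines


demo_robots = build_robots_lines("https://my_custom_domain.com", ["/private"])
print("\n".join(demo_robots))
```

The returned list can be handed straight to create_text_file(lines, "robots.txt") in the route above.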
    

That’s nice. We should consider making sorted_routes a public API.
And possibly this could be an interesting feature.
Maybe even an opt-out feature 🤔
