Make your Mephisto-powered site easy for Google, Yahoo and MSN to spider. Enter the sitemap.xml feed.
Mephisto (using 7.3 edge) has some nice built-in Atom feeds. However, I could not figure out how to get a full-site feed; everything was by section. Specifically, I wanted a feed that followed the sitemaps.org specification that Google, MSN and Yahoo use to spider a site.
Kudos to Joseph Moore for doing 90% of the work. His original version used Google's older version of the spec (0.84) and served everything at http://mydomain.com/sitemap/, but the spiders look for sitemap.xml unless told otherwise.
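For reference, this is roughly the shape of document the 0.9 protocol expects. The sketch below builds one by hand in plain Ruby, without Mephisto or Builder, purely to show the target format. The helper names (`xml_escape`, `sitemap_entry`, `sitemap_document`) and the example URL are made up for illustration.

```ruby
require 'time' # adds Time#xmlschema (W3C datetime formatting)

# Characters the sitemap protocol requires to be entity-escaped.
XML_ESCAPES = {
  "&" => "&amp;", "<" => "&lt;", ">" => "&gt;",
  '"' => "&quot;", "'" => "&apos;"
}

# Hypothetical helper: escapes text for use inside an XML element.
def xml_escape(text)
  text.gsub(/[&<>"']/) { |ch| XML_ESCAPES[ch] }
end

# Hypothetical helper: renders a single <url> element.
def sitemap_entry(loc, lastmod, changefreq, priority)
  [
    "  <url>",
    "    <loc>#{xml_escape(loc)}</loc>",
    "    <lastmod>#{lastmod.xmlschema}</lastmod>",
    "    <changefreq>#{changefreq}</changefreq>",
    "    <priority>#{priority}</priority>",
    "  </url>"
  ].join("\n")
end

# Hypothetical helper: wraps the entries in the 0.9 <urlset> root element.
def sitemap_document(entries)
  [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    entries.join("\n"),
    '</urlset>'
  ].join("\n")
end

home = sitemap_entry("http://mydomain.com/", Time.utc(2008, 1, 2, 3, 4, 5), "daily", "1.0")
puts sitemap_document([home])
```

The real template does the same job with Builder's `xml` object, which takes care of the escaping.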
Here are the changes I made to /app/views/sitemap/index.rxml
[source:ruby]
# see https://www.google.com/webmasters/tools/docs/en/protocol.html
# http://www.sitemaps.org/protocol.php
xml.instruct! :xml, :version => "1.0", :encoding => "UTF-8"
xml.urlset(:xmlns => "http://www.sitemaps.org/schemas/sitemap/0.9") do
  time_zone = TimeZone.new(@site.timezone.current_period.utc_offset)
  # Priority is a relative weighting from 0.0 to 1.0; the default is 0.5 if not specified.
  #   homepage: daily, 1.0
  #   sections: daily, 0.8
  #   articles, by days since the last update:
  #     <  1 week   = 0.9 and daily
  #     >= 1 week   = 0.8 and weekly
  #     >= 1 month  = 0.5 and monthly
  #     >= 6 months = 0.3 and yearly

  # give priority to the homepage (daily, 1.0)
  xml.url do
    xml.loc("http://#{request.host_with_port}#{request.relative_url_root}/")
    xml.lastmod(Date.today.strftime("%Y-%m-%dT%H:%M:%S#{time_zone.formatted_offset}"))
    xml.changefreq("daily")
    xml.priority("1.0")
  end

  # sections (daily, 0.8)
  @sections.each do |section|
    if section.name.downcase != "home" # exclude the home page!
      xml.url do
        xml.loc("http://#{request.host_with_port}#{request.relative_url_root}/#{Section.permalink_for(section.name.to_s)}")
        # fudge the date to be recent.
        xml.lastmod((Date.today - 1).strftime("%Y-%m-%dT%H:%M:%S#{time_zone.formatted_offset}"))
        xml.changefreq("daily")
        xml.priority("0.8")
      end
    end
  end

  # articles, weighted by age
  @articles.each do |article|
    xml.url do
      xml.loc("http://#{request.host_with_port}#{request.relative_url_root}#{site.permalink_for(article)}")
      xml.lastmod(article.updated_at.strftime("%Y-%m-%dT%H:%M:%S#{time_zone.formatted_offset}"))
      # Date - Date already yields days, so no seconds-per-day arithmetic is needed.
      age = (Date.today - Date.parse(article.updated_at.to_s)).to_i
      if age >= 180
        xml.changefreq("yearly")
        xml.priority("0.3")
      elsif age >= 30
        xml.changefreq("monthly")
        xml.priority("0.5")
      elsif age >= 7
        xml.changefreq("weekly")
        xml.priority("0.8")
      else
        xml.changefreq("daily")
        xml.priority("0.9")
      end
    end
  end
end
[/source]
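The age buckets are the part most likely to get tweaked, so it can help to pull them out of the template into a small method that is easy to test on its own. A minimal sketch in plain Ruby; the name `sitemap_hints_for` is my own invention and not part of Mephisto:

```ruby
require 'date'

# Hypothetical helper (not part of Mephisto): maps an article's last-update
# date to the [changefreq, priority] pair used in the template above.
def sitemap_hints_for(updated_on, today = Date.today)
  age = (today - updated_on).to_i # whole days since the last update
  if age >= 180
    ["yearly", "0.3"]
  elsif age >= 30
    ["monthly", "0.5"]
  elsif age >= 7
    ["weekly", "0.8"]
  else
    ["daily", "0.9"]
  end
end

# Usage: an article last touched ten days ago falls in the "weekly" bucket.
changefreq, priority = sitemap_hints_for(Date.today - 10)
puts "#{changefreq} #{priority}"
```

Passing `today` as an argument keeps the method deterministic, so the bucket boundaries can be checked without depending on the current date.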
Here are the changes I made to /app/controllers/sitemap_controller.rb to support the section lookup.
[source:ruby]
class SitemapController < ApplicationController
  layout nil
  session :off

  def index
    @sections = site.sections.find(:all)
    @articles = Article.find(:all, :conditions => "published_at is not null")
  end
end
[/source]
I’ve also added a route to lib/mephisto/routing.rb so that http://mydomain.com/sitemap.xml is recognized.
[source:ruby]
def self.connect_with(map)
  # Allows access to the sitemap!
  map.connect 'sitemap', :controller => 'sitemap'
  map.connect 'sitemap.xml', :controller => 'sitemap'
  …
[/source]
When I get the chance I’ll make this a plugin. Very handy.
Update:
I forgot the xml.instruct! line that declares the charset in the index.rxml template. I’ve added it above.