| 1 | +# Sitemaps |
| 2 | + |
| 3 | +## About |
| 4 | + |
| 5 | +Sitemaps can be written in either plain-text (TXT) or XML format; XML is the more common. |
| 6 | +Information about sitemaps can be found at https://www.sitemaps.org. |
| 7 | + |
| 8 | +It is recommended that all data set landing pages be listed in a sitemap. |
| 9 | + |
| 10 | +If your web site already has a sitemap, you can add the URLs for your |
| 11 | +landing pages to it. Harvesters such as Gleaner, as well as Google and |
| 12 | +other crawlers, will inspect the listed pages for the JSON-LD data graphs |
| 13 | +embedded in them. |
| 14 | + |
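As a rough illustration of what "inspecting a page for JSON-LD" involves, the sketch below pulls the contents of `<script type="application/ld+json">` tags out of an HTML document using only the Python standard library. It is not how Gleaner or any particular crawler is implemented; the class name, function name, and sample page are hypothetical.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.graphs = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.graphs.append(json.loads("".join(self._buffer)))


def extract_jsonld(html):
    """Return a list of JSON-LD documents found in the given HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    return parser.graphs


# Hypothetical landing page, used only for illustration.
SAMPLE_PAGE = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org/", "@type": "Dataset", "name": "Example data set"}
</script>
</head><body></body></html>"""

graphs = extract_jsonld(SAMPLE_PAGE)
```

A landing page that is listed in the sitemap but carries no such script tag would yield an empty list here, which is one quick way to check your own pages before a harvester visits them.
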
| 15 | +## Sitemap Index |
| 16 | + |
| 17 | +You can also use a sitemap index, which is in effect a sitemap of sitemaps. |
| 18 | +An index might look like: |
| 19 | + |
| 20 | +<?xml version="1.0" encoding="UTF-8"?> |
| 21 | +<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"> |
| 22 | +  <sitemap> |
| 23 | +    <loc>http://www.example.com/sitemap1.xml</loc> |
| 24 | +    <lastmod>2004-10-01T18:23:17+00:00</lastmod> |
| 25 | +  </sitemap> |
| 26 | +  <sitemap> |
| 27 | +    <loc>http://www.example.com/sitemap2.xml</loc> |
| 28 | +    <lastmod>2005-01-01</lastmod> |
| 29 | +  </sitemap> |
| 30 | +</sitemapindex> |
| 31 | + |
| 32 | +In this case, sitemap1.xml might list your informational and general site |
| 33 | +pages, while sitemap2.xml is dedicated to your data set landing pages. |
| 34 | + |
| 35 | +A single sitemap can hold at most 50,000 URLs, so if you have more than that |
| 36 | +you will need to split the entries into files of 50,000 or fewer and |
| 37 | +reference each of those files from a sitemap index. |
| 38 | + |
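The splitting described above can be sketched in Python roughly as follows. This is a minimal example under stated assumptions, not a reference implementation: the function names, the `sitemapN.xml` file naming, and the `/dataset/` URL pattern are all hypothetical, and the output omits the XML declaration and `<lastmod>` elements for brevity.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # per-file entry limit from the sitemap protocol


def chunk(urls, size=MAX_URLS):
    """Yield successive lists of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_sitemap(urls):
    """Return a <urlset> document, as a string, listing the given URLs."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(root, "url"), "loc").text = url
    return ET.tostring(root, encoding="unicode")


def build_index(sitemap_urls):
    """Return a <sitemapindex> document, as a string, pointing at each sitemap."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        ET.SubElement(ET.SubElement(root, "sitemap"), "loc").text = url
    return ET.tostring(root, encoding="unicode")


def build_all(urls, base_url, size=MAX_URLS):
    """Split `urls` into sitemap files and return ({filename: xml}, index_xml)."""
    files = {}
    for number, part in enumerate(chunk(urls, size), start=1):
        files[f"sitemap{number}.xml"] = build_sitemap(part)
    index = build_index([f"{base_url}/{name}" for name in files])
    return files, index


# Hypothetical set of 120,000 landing pages, split into three files.
urls = [f"http://www.example.com/dataset/{i}" for i in range(120_000)]
files, index = build_all(urls, "http://www.example.com")
```

With 120,000 URLs this produces two full files of 50,000 entries and a third with the remaining 20,000, all referenced from the index.
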
| 39 | +## Robots.txt |
| 40 | + |
| 41 | +You can also list sitemaps in your robots.txt file by adding a Sitemap: line |
| 42 | +for each one. The same file lets you define rule groups per user agent, |
| 43 | +which can be used to direct different agents to different parts of your site. |
| 44 | + |
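To see how a crawler might read such a file, here is a small sketch using Python's standard `urllib.robotparser` module (the `site_maps()` method requires Python 3.8 or later). The robots.txt content, the `ExampleHarvester` agent name, and the paths are hypothetical and chosen only for illustration. Note that, per the sitemaps.org protocol, a Sitemap: line is independent of the user-agent groups, so it applies to every agent.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; agent name and paths are illustrative only.
ROBOTS_TXT = """\
Sitemap: http://www.example.com/sitemap_index.xml

User-agent: ExampleHarvester
Allow: /datasets/
Disallow: /

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Sitemap URLs declared anywhere in the file.
sitemaps = parser.site_maps()

# The harvester group only permits the data set pages.
harvester_ok = parser.can_fetch("ExampleHarvester", "http://www.example.com/datasets/1")
harvester_blocked = parser.can_fetch("ExampleHarvester", "http://www.example.com/about")

# Any other agent falls through to the "*" group.
other_ok = parser.can_fetch("SomeOtherBot", "http://www.example.com/about")
```

How individual crawlers interpret overlapping Allow and Disallow rules can differ, so it is worth checking the documentation of the agents you care about before relying on this behaviour.
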
| 45 | + |
| 46 | +Ref: https://tools.ietf.org/html/draft-rep-wg-topic-00 |
| 47 | + |