Rewrite hexdocs.pm canonical links to per-package subdomains#119
Merged
Member
Do we need to rewrite the pages, given we redirect anyway? 🤔
Member
Author
Long term, the plan was to deprecate the redirects, but they are the only way we can get things like the sitemap to work, so I guess we will keep them forever? I think presenting one URL format can help SEO, though, and help search engines dedup pages.
Member
What we really need to fix is the canonical. We have fixed it in the past though, right? So maybe do something targeted to that?
ExDoc emits a <link rel="canonical"> tag (when the package sets the :canonical option) pointing at the old path-based URL (https://hexdocs.pm/<package>/...). Now that docs are served from per-package subdomains, that canonical points away from where the page actually lives, splitting the SEO signal between the apex path and the subdomain. Rewrite the canonical tag at ingestion time in the file rewriter so it points at https://<package>.hexdocs.pm/..., reusing package_to_subdomain for the underscore-to-hyphen mapping and upgrading http to https. The bare apex, apex files such as sitemap.xml, and canonical links that already use a subdomain are left untouched. Body links and other tags are intentionally not rewritten: a permanent 301 redirect from the old URLs preserves link equity, so the canonical tag is the only place where the rewrite changes SEO behavior.
890beb4 to d8209ac
Member
Author
Updated to only rewrite canonical links
josevalim
approved these changes
May 31, 2026
ExDoc emits a <link rel="canonical"> tag (when the package sets the :canonical option) pointing at the old path-based URL (https://hexdocs.pm/<package>/...). Now that docs are served from per-package subdomains, that canonical points away from where the page actually lives, splitting the SEO signal between the apex path and the subdomain.
This rewrites the canonical tag at ingestion time in Hexdocs.FileRewriter so it points at https://<package>.hexdocs.pm/..., reusing package_to_subdomain/1 for the underscore-to-hyphen mapping and upgrading http to https. The bare apex, apex files such as sitemap.xml, and canonical links that already use a subdomain are left untouched.
Body links and other tags are intentionally not rewritten: a permanent 301 redirect from the old URLs preserves link equity, so the canonical tag is the only place where rewriting changes SEO behavior. Only .html files are processed.
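The rules in this description can be illustrated with a small sketch. It is written in Python for brevity and is not the Elixir code in Hexdocs.FileRewriter; the regular expressions, the function names rewrite_canonical and rewrite_file, and the assumption that package names consist only of lowercase letters, digits, and underscores are all hypothetical choices made for the example.

```python
import re

# Matches a whole <link ...> tag whose rel attribute is "canonical".
# Only tags matched here are ever modified; body links are never touched.
CANONICAL_TAG = re.compile(
    r'<link\b[^>]*\brel=(["\'])canonical\1[^>]*>', re.IGNORECASE
)

# Matches an href on the apex host whose first path segment looks like a
# package name (assumed charset: a-z, 0-9, underscore). A segment such as
# "sitemap.xml" contains a dot, so apex files do not match, and neither
# does the bare apex or a host that already has a subdomain.
APEX_HREF = re.compile(
    r'href=(["\'])https?://hexdocs\.pm/([a-z0-9_]+)(/[^"\']*)?\1'
)


def package_to_subdomain(package: str) -> str:
    """Stand-in for the helper named in the PR: underscores become hyphens."""
    return package.replace("_", "-")


def rewrite_canonical(html: str) -> str:
    """Point path-based canonical links at the per-package subdomain."""

    def fix_href(match: re.Match) -> str:
        quote, package, rest = match.group(1), match.group(2), match.group(3) or ""
        host = f"{package_to_subdomain(package)}.hexdocs.pm"
        # The scheme is always emitted as https, which upgrades http links.
        return f"href={quote}https://{host}{rest}{quote}"

    def fix_tag(match: re.Match) -> str:
        return APEX_HREF.sub(fix_href, match.group(0))

    return CANONICAL_TAG.sub(fix_tag, html)


def rewrite_file(path: str, content: str) -> str:
    """Apply the rewrite to .html files only; other files pass through."""
    if not path.endswith(".html"):
        return content
    return rewrite_canonical(content)
```

Under these assumptions, a canonical link to http://hexdocs.pm/my_pkg/readme.html becomes https://my-pkg.hexdocs.pm/readme.html, while canonical links to https://hexdocs.pm/, https://hexdocs.pm/sitemap.xml, or an existing subdomain URL come back unchanged. Scoping the href substitution to the matched canonical tag is what keeps ordinary anchors in the page body out of reach of the rewrite.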