url: add distribution and version to user-agent request header (BC)

As a server operator, I've always wanted to know what Mercurial
version clients are running so I can track version adoption and
make informed decisions about which versions of Mercurial to
support in extensions. Unfortunately, there is no easy way to discern
this today: the best you can do is look for high-level feature usage
(e.g. bundle2) or sniff capabilities from bundle2 commands. And these
things aren't changed frequently enough to tell you anything that
interesting.

Nearly every piece of software talking HTTP sends its version in
the user agent. This includes web browsers, curl, and even Git.

This patch adds the distribution name and version to the user-agent
HTTP request header. We choose "Mercurial" for the distribution
name because that seems appropriate. The version string comes
from __version__.

The value is inside parenthesis for a few reasons:

* The version *may* contain spaces
* Alternate forms like "Mercurial/<version>" imply structure and
  since the user agent should not be used by servers for protocol
  or feature negotiation/detection, we don't want to even give the
  illusion that the value should be parsed. A free form field is
  the most hostile to parsing.

Flagging the patch as BC so it shows up in release notes. This
change should be backwards compatible. But I wouldn't be surprised if
a server somewhere is filtering on the exact old user agent string. So
I want to make noise about this change.
This commit is contained in:
Gregory Szorc 2016-07-14 19:16:46 -07:00
parent 8140c5f97f
commit 7f2f118763

View File

@ -505,8 +505,20 @@ def opener(ui, authinfo=None):
handlers.extend([h(ui, passmgr) for h in handlerfuncs])
opener = urlreq.buildopener(*handlers)
# 1.0 here is the _protocol_ version
opener.addheaders = [('User-agent', 'mercurial/proto-1.0')]
# The user agent should should *NOT* be used by servers for e.g.
# protocol detection or feature negotiation: there are other
# facilities for that.
#
# "mercurial/proto-1.0" was the original user agent string and
# exists for backwards compatibility reasons.
#
# The "(Mercurial %s)" string contains the distribution
# name and version. Other client implementations should choose their
# own distribution name. Since servers should not be using the user
# agent string for anything, clients should be able to define whatever
# user agent they deem appropriate.
agent = 'mercurial/proto-1.0 (Mercurial %s)' % util.version()
opener.addheaders = [('User-agent', agent)]
opener.addheaders.append(('Accept', 'application/mercurial-0.1'))
return opener