{
    "componentChunkName": "component---src-templates-post-js",
    "path": "/blog/2019/05/05/telemetry-success/",
    "result": {"data":{"blog":{"html":"<div class=\"paragraph\">\n<p>Half a year ago we delivered a security fix for Jenkins that had the potential to break the entire Jenkins UI.\nWe needed to change how Jenkins, through the Stapler web framework, handled HTTP requests, tightening the rules around what requests would be processed by Jenkins.\nIn the six months since, we didn&#8217;t receive notable reports of problems resulting from this change, and it&#8217;s thanks to the telemetry we gathered beforehand.</p>\n</div>\n<div class=\"sect1\">\n<h2 id=\"the-problem\"><a class=\"anchor\" href=\"#the-problem\"></a>The Problem</h2>\n<div class=\"sectionbody\">\n<div class=\"paragraph\">\n<p>Jenkins uses the Stapler web framework for HTTP request handling.\nStapler&#8217;s basic premise is that it uses reflective access to code elements matching its naming conventions.\nFor example, any public method whose name starts with <code>get</code>, and that has a <code>String</code>, <code>int</code>, <code>long</code>, or no argument can be invoked this way on objects that are reachable through these means.\nAs these naming conventions closely match common code patterns in Java, accessing crafted URLs could invoke methods never intended to be invoked this way.</p>\n</div>\n<div class=\"paragraph\">\n<p>A simple example of that is a URL every Jenkins user would be familiar with: <code>/job/jobname</code>.\nThis ends up invoking a method called <code>#getJob(String)</code>, with the argument being <code>\"jobname\"</code>, on the root application object, and having it handle the rest of the URL, if any.\nOf course, this is a URL intended to be accessed this way.\nHow about invoking <code>Object#getClass()</code>, followed by <code>Class#getClassLoader()</code>, by accessing the URL <code>/class/classLoader</code>?\nWhile this particular chain would not result in a useful response, this doesn&#8217;t change that the methods were invoked.\nWe identified a number of URLs that could 
be abused to access otherwise inaccessible jobs, or even invoke internal methods in the web application server to invalidate all sessions.\n<a href=\"https://jenkins.io/security/advisory/2018-12-05/\">The security advisory</a> provides an overview of the issues we&#8217;d identified by then.</p>\n</div>\n</div>\n</div>\n<div class=\"sect1\">\n<h2 id=\"the-idea\"><a class=\"anchor\" href=\"#the-idea\"></a>The Idea</h2>\n<div class=\"sectionbody\">\n<div class=\"paragraph\">\n<p>To solve this problem inherent in the Stapler framework&#8217;s design, we defined rules that restrict invocation beyond what would be allowed by Stapler.\nFor example, the declared return type of getters now needed to be one defined in Jenkins core or a Jenkins plugin and have either clearly Stapler-related methods (with Stapler annotations, parameter types, etc.) or Stapler-related resource files associated with it.\nOtherwise, the type wouldn&#8217;t be aware of Stapler, and couldn&#8217;t produce a meaningful response anyway.</p>\n</div>\n<div class=\"paragraph\">\n<p>This meant that getters just declaring <code>Object</code> (or <code>List</code>, <code>Map</code>, etc.) 
would no longer be allowed by default.\nIt was clear to the developers working on this problem that we needed the ability to override the default rules for specific getters.\nBut allowing plugin developers to adapt their plugins after we published the fix wasn&#8217;t going to cut it;\nJenkins needed to ship with a comprehensive default whitelist for methods known not to conform to the new rules, so that updating would not result in problems for users.</p>\n</div>\n</div>\n</div>\n<div class=\"sect1\">\n<h2 id=\"the-solution\"><a class=\"anchor\" href=\"#the-solution\"></a>The Solution</h2>\n<div class=\"sectionbody\">\n<div class=\"paragraph\">\n<p>While there is tooling like <a href=\"https://github.com/jenkinsci/plugin-compat-tester/\">Plugin Compatibility Tester</a> and <a href=\"https://github.com/jenkinsci/acceptance-test-harness\">Acceptance Test Harness</a>, many Jenkins plugins do not have comprehensive tests of their UI&#8201;&#8212;&#8201;the Jenkins UI is fairly stable after all.\nWe did not expect to have sufficient test coverage to deliver a change like this with confidence.\nThe only way we would be able to build such a comprehensive whitelist would be to add telemetry to Jenkins.</p>\n</div>\n<div class=\"paragraph\">\n<p>While Jenkins instances periodically report usage statistics to the Jenkins project, the information included is very bare bones and mostly useful for knowing the number of installations, the popularity of plugins, and the general size of Jenkins instances through number and types of jobs and agents.\nWe also didn&#8217;t want to just collect data without a clear goal, so we set ourselves some limitations&#8201;&#8212;&#8201;collect as little data as possible, collect no personally identifiable information, have a specific purpose for each kind of information we would collect, and define an end date for the collection in advance.\nWe defined all of this in <a 
href=\"https://github.com/jenkinsci/jep/blob/master/jep/214/README.adoc\">JEP-214</a>, created the <a href=\"https://github.com/jenkins-infra/uplink\">Uplink service that would receive submissions</a>, and added the basic client framework to Jenkins.\nThe implementation is fairly basic&#8201;&#8212;&#8201;we just submit an arbitrary JSON object with some added metadata to a service.\nThis system would inform tweaks to a security fix we were anxious to get out, after all.</p>\n</div>\n<div class=\"paragraph\">\n<p>Starting in mid-October for weekly releases, and early November for LTS, tens of thousands of Jenkins instances would submit Stapler request dispatch telemetry daily, and we would keep identifying code incompatible with the new rules and amending the fix.\nUltimately, <a href=\"https://github.com/jenkinsci/jenkins/blob/44c4d3989232082c254d27ae360aa810669f44b7/core/src/main/resources/jenkins/security/stapler/default-whitelist.txt\">the whitelist</a> would include a few dozen entries, preventing serious regressions in popular plugins like <a href=\"https://plugins.jenkins.io/credentials\">Credentials Plugin</a>, <a href=\"https://plugins.jenkins.io/junit\">JUnit Plugin</a>, or the Pipeline plugin suite, down to <a href=\"https://plugins.jenkins.io/google-cloud-health-check\">Google Health Check Plugin</a>, a plugin with just 80 installations when we published the fix.</p>\n</div>\n<div class=\"paragraph\">\n<p>Learning what requests would result in problems also allowed us to write better developer documentation&#8201;&#8212;&#8201;we already knew what code patterns would break, and how popular each of them was in the plugin ecosystem.</p>\n</div>\n</div>\n</div>\n<div class=\"sect1\">\n<h2 id=\"the-overhaul\"><a class=\"anchor\" href=\"#the-overhaul\"></a>The Overhaul</h2>\n<div class=\"sectionbody\">\n<div class=\"paragraph\">\n<p>I wrote above:</p>\n</div>\n<div class=\"quoteblock\">\n<blockquote>\n<div class=\"paragraph\">\n<p>For example, the declared 
return type of getters now needed to be one defined in Jenkins core or a Jenkins plugin and have either clearly Stapler-related methods (with Stapler annotations, parameter types, etc.) or Stapler-related resource files associated with it.</p>\n</div>\n</blockquote>\n</div>\n<div class=\"paragraph\">\n<p>While this was true for the fix during most of development, it isn&#8217;t how the fix that we published actually works.\nAbout a month before the intended release date, internal design/code review feedback criticized the complicated and time-consuming implementation that at the time required scanning the class path of Jenkins and all plugins and looking for related resources, and suggested a different approach.</p>\n</div>\n<div class=\"paragraph\">\n<p>So we tried to require that the declared type or any of its ancestors be annotated with the new annotation <code>@StaplerAccessibleType</code>, annotated a bunch of types in Jenkins itself (<a href=\"https://javadoc.jenkins.io/hudson/model/ModelObject.html\"><code>ModelObject</code></a> being the obvious first choice), and ran our scripts that check whether Stapler would be allowed to dispatch methods identified in telemetry.\nWe&#8217;d long since automated the daily update of dispatch telemetry processing, so it was a simple matter of changing which Jenkins build we were working with.</p>\n</div>\n<div class=\"paragraph\">\n<p>After a few iterations of adding the annotation to more classes, the results were very positive: very few additional types needed whitelisting, while many more were no longer (unnecessarily) allowed to be dispatched to.\nThis experiment, late during development, ended up being essentially the fix we delivered.\nWe didn&#8217;t need to perform costly scanning of the class path on startup&#8201;&#8212;&#8201;we didn&#8217;t need to scan the class path at all&#8201;&#8212;&#8201;and the rules governing request dispatch in Stapler, while different from before, are still pretty easy to 
understand and independent of how components are packaged.</p>\n</div>\n</div>\n</div>\n<div class=\"sect1\">\n<h2 id=\"the-outcome\"><a class=\"anchor\" href=\"#the-outcome\"></a>The Outcome</h2>\n<div class=\"sectionbody\">\n<div class=\"paragraph\">\n<p>As usual when delivering a fix we expect could result in regressions in plugins, we <a href=\"https://wiki.jenkins.io/display/JENKINS/Plugins+affected+by+the+SECURITY-595+fix\">created a wiki page</a> that users could report problems on.\nRight now, there&#8217;s one entry on that wiki page.\nIt is one we were aware of well before release and decided against whitelisting; the affected, undocumented feature in Git Plugin ended up being removed.\nThe situation in our issue tracker is only slightly worse, with two apparently minor issues having been reported in Jira.</p>\n</div>\n<div class=\"paragraph\">\n<p>Without telemetry, delivering a fix like this one would have been difficult to begin with.\nTinkering with the implementation just a few weeks before release and having any confidence in the result?\nNot causing any significant regressions?\nI think this would simply have been impossible.</p>\n</div>\n</div>\n</div>","id":"7783e32b-2866-5124-b6e5-a89740fbfd19","title":"First successful use of Jenkins telemetry","date":"2019-05-05T00:00:00.000Z","slug":"/blog/2019/05/05/telemetry-success/","links":{"discourse":""},"authors":[{"avatar":null,"blog":null,"github":"daniel-beck","html":"<div class=\"paragraph\">\n<p>Daniel is a Jenkins core maintainer and member of the <a href=\"/security/#team\">Jenkins security team</a>.\nHe was the inaugural Jenkins security officer from 2015 to 2021.\nHe sometimes contributes to developer documentation and project infrastructure in his spare time.</p>\n</div>","id":"daniel-beck","irc":null,"linkedin":null,"name":"Daniel 
Beck","slug":"/blog/authors/daniel-beck","twitter":null}]}},"pageContext":{"next":"/blog/2019/05/09/chinese-localization/","previous":"/blog/2019/04/03/security-advisory/","id":"7783e32b-2866-5124-b6e5-a89740fbfd19"}},
    "staticQueryHashes": ["1271460761","3649515864"]}