The Importance of Code Privacy

How OtterWise keeps your business safe while tracking code health

For many businesses, code is their intellectual property (IP). Granting third parties access to it requires a thorough review of the vendor and, in some cases, contracts, certifications and more to establish trust.

Even if the third-party vendor is a good actor and keeps up with the best security standards, incidents can still happen and put your business at risk.

This, among other reasons, is why Code Privacy should be an important metric when picking software vendors. Tools you use today might be able to access your source code: automated code style fixers, error monitoring software and more. Anything with a Git integration might have been given permission to access your private repositories.

OtterWise is built on a foundation that prioritises simplicity and privacy. By reducing the information that OtterWise can access, we lower the risk for our clients and ourselves.

Tracking Code Coverage, Without Accessing Code

Building a code coverage tool that does not have access to code is no easy feat, especially when the API permissions offered by Git providers are broader than what is needed.

So, to avoid ever coming into contact with source code, some measures had to be taken. First, let's inspect where code could have been accessed:

  • Git Diff (for Patch Coverage)
  • Coverage Files (for example, a Clover coverage XML)
  • File viewer (inside OtterWise)

Let us tackle each of these, one by one:

Making the Git Diff Private

A unified git diff usually looks something like this:

diff --git a/resources/assets/js/orders/order-manage.vue b/resources/assets/js/orders/order-manage.vue
index 1592c4dba3..607fe5a208 100644
--- a/resources/assets/js/orders/order-manage.vue
+++ b/resources/assets/js/orders/order-manage.vue
@@ -554,6 +554,7 @@
 								<col width="90" />
 								<col width="140" />
 								<col width="100" />
+								<col width="100" />
 								<col width="80" />
 							</colgroup>
 							<thead>
@@ -573,6 +574,9 @@
 										({{ item.currency || global_setting("user_account.settings.system.currency") }})
 									</th>
 									<th class="text-right">{{ trans("misc.discount") }}</th>
+									<th class="text-right">
+										{{ trans("misc.profit") + " %" }}
+									</th>
 									<th class="text-right">
 										{{ trans("misc.total") }}
 										({{ item.currency || global_setting("user_account.settings.system.currency") }})

This diff is full of source code, which we don't want. The diff is necessary to figure out patch coverage, but the code itself is not.

What is interesting to us are the file names and the diff markings (+/- etc.).

So to make it private, all we have to do is trim out the code. Our open-source uploader script handles this; here is a rough example:

// Str is Laravel's string helper
use Illuminate\Support\Str;

// Split diff into an array of lines
$diffLines = explode("\n", $diff);

// Strip code!
foreach($diffLines as $index => $line) {
        // Skip everything we want to keep

        if(Str::startsWith($line, 'diff --git a/')) {
            continue;
        }

        if(preg_match('/^(new file mode [0-9]{6})/', $line)) {
            continue;
        }

        if(preg_match('/^(deleted file mode [0-9]{6})/', $line)) {
            continue;
        }

        if(preg_match('/^(index ([0-9a-zA-Z]{7})\.\.([0-9a-zA-Z]{7}))/', $line)) {
            continue;
        }

        if(Str::startsWith($line, '--- ')) {
            continue;
        }

        if(Str::startsWith($line, '+++ ')) {
            continue;
        }

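        if(preg_match('/^(@@ -[0-9]{1,}(,[0-9]{1,}){0,1} \+[0-9]{1,}(,[0-9]{1,}){0,1} @@)/', $line, $matches)) {
            // Keep only the hunk header; drop the code context that follows it
            $diffLines[$index] = $matches[1];
            continue;
        }

        // Otherwise keep only the first character (+, - or a space for context lines)

        $diffLines[$index] = substr($line, 0, 1);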
}

// Put back together the diff for later usage
$diff = implode("\n", $diffLines);

The output will now be:

diff --git a/resources/assets/js/orders/order-manage.vue b/resources/assets/js/orders/order-manage.vue
index 1592c4dba3..607fe5a208 100644
--- a/resources/assets/js/orders/order-manage.vue
+++ b/resources/assets/js/orders/order-manage.vue
@@ -554,6 +554,7 @@



+



@@ -573,6 +574,9 @@



+
+
+



Notice how the diff is now mostly empty lines, except where a +/- marks a change. Filenames are kept for reference and to track the history of each file. This process happens in the CI workflow, so OtterWise never sees the original diff.
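To illustrate why the stripped diff is still enough for patch coverage, here is a minimal sketch that recovers the line numbers of added lines from the hunk headers and +/- markers alone. It is not the actual OtterWise implementation: the function name addedLinesFromStrippedDiff is hypothetical, and the sketch assumes PHP 8 (for str_starts_with) and a diff that has already been stripped as described above.

```php
<?php

/**
 * Collect the new-file line numbers of added lines from a stripped diff.
 * Hypothetical helper for illustration; not part of the OtterWise uploader.
 *
 * @return array<string, int[]> map of file path => added line numbers
 */
function addedLinesFromStrippedDiff(string $diff): array
{
    $added = [];
    $file = null;      // path of the file currently being read
    $inHunk = false;   // true once a hunk header has been seen for this file
    $newLine = 0;      // current line number in the new version of the file

    foreach (explode("\n", $diff) as $line) {
        if (str_starts_with($line, 'diff --git ')) {
            // A new file section begins
            $file = null;
            $inHunk = false;
            continue;
        }

        if (!$inHunk && str_starts_with($line, '+++ ')) {
            // "+++ b/path" names the new version; deleted files use /dev/null
            $path = substr($line, 4);
            $file = $path === '/dev/null' ? null : preg_replace('#^b/#', '', $path);
            continue;
        }

        if (preg_match('/^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@/', $line, $m)) {
            // The hunk header says where the hunk starts in the new file
            $inHunk = true;
            $newLine = (int) $m[1];
            continue;
        }

        if (!$inHunk || $file === null) {
            continue; // header lines such as "index ..." or "--- a/..."
        }

        if ($line === '+') {
            $added[$file][] = $newLine++;
        } elseif ($line !== '-' && $line !== '\\') {
            // Context lines (stripped to a space or nothing) exist in the new file
            $newLine++;
        }
    }

    return $added;
}
```

Run against the stripped example above, this yields lines 557, 577, 578 and 579 for order-manage.vue. Those line numbers can then be compared with the per-line hit counts in the coverage file, without any source code being involved.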

Coverage Files

Another point of contact with code is the Coverage Files (clover, etc.), which can contain class and method names that some might deem sensitive. Therefore, as with the git diff, we strip away any data we can, as it is not needed. A coverage clover might look like this (trimmed for brevity):

<?xml version="1.0" encoding="UTF-8"?>
<coverage generated="1672008447">
    <project timestamp="1672008447">
        <package name="LasseRafn\CsvReader">
            <file name="/home/runner/work/csv-reader/csv-reader/src/Reader.php">
                <class name="LasseRafn\CsvReader\Reader" namespace="LasseRafn\CsvReader">
                    <metrics .../>
                </class>
                <line num="53" type="method" name="__construct" visibility="public" complexity="4" crap="4.02" count="17"/>
                <line num="54" type="stmt" count="17"/>
                <line num="55" type="stmt" count="1"/>
                <line num="58" type="stmt" count="17"/>

But after stripping values, the file being sent to OtterWise looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<coverage>
    <project>
        <package>
            <file name="/csv-reader/csv-reader/src/Reader.php">
                <class>
                    <metrics .../>
                </class>
                <line num="53" type="method" complexity="4" crap="4.02" count="17"/>
                <line num="54" type="stmt" count="17"/>
                <line num="55" type="stmt" count="1"/>
                <line num="58" type="stmt" count="17"/>

Notice how class names, method names and namespaces have been removed. We have to keep line "type", as it indicates if the line is a statement, method, or something else, which is important in coverage tracking; however, the actual line content is never retrieved. Filenames are kept for referencing and keeping history of coverage across files.
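The stripping shown above could be sketched roughly as follows, using PHP's built-in DOM extension. This is an assumption-laden illustration rather than the real uploader code: the function name stripCloverXml, the $pathPrefix parameter and the exact list of removed attributes are hypothetical, chosen to match the before/after example.

```php
<?php

/**
 * Remove names and timestamps from a Clover coverage report.
 * Hypothetical sketch; the real uploader may work differently.
 */
function stripCloverXml(string $xml, string $pathPrefix = ''): string
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    $xpath = new DOMXPath($doc);

    // Attributes that reveal names, or that are simply not needed
    $queries = [
        '//coverage/@generated',
        '//project/@timestamp',
        '//package/@name',
        '//class/@name',
        '//class/@namespace',
        '//line/@name',
        '//line/@visibility',
    ];

    foreach ($queries as $query) {
        // Copy the node list first so removal does not disturb iteration
        foreach (iterator_to_array($xpath->query($query)) as $attr) {
            $attr->ownerElement->removeAttributeNode($attr);
        }
    }

    // Drop the CI-specific prefix from file paths, keeping the rest
    if ($pathPrefix !== '') {
        foreach ($xpath->query('//file[@name]') as $file) {
            $name = $file->getAttribute('name');
            if (str_starts_with($name, $pathPrefix)) {
                $file->setAttribute('name', substr($name, strlen($pathPrefix)));
            }
        }
    }

    return $doc->saveXML();
}
```

Calling stripCloverXml($xml, '/home/runner/work') on a report like the one above would keep line numbers, types, metrics and hit counts, while class, method, package and namespace names are removed.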

File Viewer inside OtterWise

Note: for public repositories, we permit the File Viewer without any extra opt-in or extensions, as the code is publicly available. This section primarily focuses on private repositories, although the options are available for public ones too.

This one is a bit trickier, since it requires actually displaying code in the browser, inside OtterWise. I decided to go with multiple solutions to the problem, to avoid the user having a worse experience from using OtterWise.

Let us look at the four options that will be provided to users.

Option 1: Annotations

We can create GitHub check annotations on a per-line basis without ever viewing or having access to code. This lets us provide a decent experience and quick overview, with no additional effort for the users. Enabling this can be done through the repository settings page.
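As a rough sketch of how this can work with nothing but file names and line numbers, the helper below turns a list of uncovered lines into annotation objects in the shape the GitHub Checks API accepts (path, start_line, end_line, annotation_level, message). The function names, the choice of the "warning" level and the message wording are assumptions for illustration, not OtterWise's actual output.

```php
<?php

/**
 * Build check annotations for uncovered lines, merging consecutive
 * lines into ranges. Hypothetical helper for illustration only.
 *
 * @param array<string, int[]> $uncovered map of file path => uncovered line numbers
 * @return array<int, array<string, mixed>>
 */
function buildCoverageAnnotations(array $uncovered): array
{
    $annotations = [];

    foreach ($uncovered as $path => $lines) {
        $lines = array_unique($lines);
        sort($lines);

        $start = null;
        $end = null;
        foreach ($lines as $n) {
            if ($start !== null && $n === $end + 1) {
                $end = $n; // extend the current range
                continue;
            }
            if ($start !== null) {
                $annotations[] = makeAnnotation($path, $start, $end);
            }
            $start = $end = $n; // begin a new range
        }
        if ($start !== null) {
            $annotations[] = makeAnnotation($path, $start, $end);
        }
    }

    return $annotations;
}

/** Create one annotation; the message text is a placeholder. */
function makeAnnotation(string $path, int $start, int $end): array
{
    return [
        'path' => $path,
        'start_line' => $start,
        'end_line' => $end,
        'annotation_level' => 'warning',
        'message' => $start === $end
            ? sprintf('Line %d is not covered by tests.', $start)
            : sprintf('Lines %d-%d are not covered by tests.', $start, $end),
    ];
}
```

The resulting array would be sent as the annotations of a check run's output. No file contents are read at any point, only paths and line numbers.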

Option 2: Downloadable coverage files

You can download the coverage file generated during CI and import it into a supported IDE or code editor, such as PhpStorm, which can then render per-line coverage.

Option 3: Browser Extension (COMING SOON)

A Chrome browser extension pulls coverage data from OtterWise while you browse GitHub. It automatically displays line coverage directly in the file viewer there, which means OtterWise never has access to the code, but you still get all the benefits of viewing code coverage per line, including hits and more.

Option 4: Opt-in code access (COMING SOON)

Organisation admins can opt in to granting access to code; team members without admin access cannot enable the feature. Once the feature is enabled, only team members with access to the repository will be able to access the code through OtterWise.

When a user attempts to view a file inside OtterWise, we will notify them that it is not possible without granting additional permissions. If they are an organisation admin, they will get the option to enable it, and will be taken through a GitHub App install flow for a separate app that deals with code viewing. Once it is installed, users in the organisation with access to the repository will be able to view code directly inside OtterWise. Revoking the special code-viewer app is simple and will not impact the other functions of OtterWise, letting you opt out again easily with no downsides.

Installing this app DOES mean that a token with access to code will be persisted. We follow the best security practices; however, if your organisation doesn't want third parties to be granted such scopes, we suggest using options 1, 2 or 3, all of which completely cut off code access to private repositories. Please don't hesitate to reach out at [email protected] with your suggestions and feedback to further improve code privacy.

The Outcome

Implementing these changes means that we can remove the GitHub scopes that previously let us view code, and also simplify our ingress API so that it no longer has to strip away code, as code is never accessed or sent.

The only code-related data that reaches OtterWise servers is:

  • Line numbers
  • File names
  • Repository names
  • File coverage numbers

All uploader code, which runs in your CI environment, is open source and pulled directly from GitHub rather than our servers. This lets you inspect the code that is executed, and optionally pin your CI to a specific uploader SHA to ensure it never changes.

The browser extension is also open source, giving you insight into how we get coverage data from our servers, and how we determine when and where to show it. Moreover, you can see what data is sent to our servers (such as the repository, SHA and name of the file being viewed, which are needed to return coverage data to the browser).

Final Thoughts

While I see arguments for entirely ditching the File Viewer (viewing code inside OtterWise), I do believe some users will appreciate the simplicity of the option. It also lets us show public repository code for contributors, without them having to download a browser extension, since the code is publicly available.

A good compromise was found by letting organisation admins decide, while letting users still take advantage of code coverage tracking and per-line coverage information.
