This project consists of a handful of scripts that support usage of the static analysis capabilities of HHVM within a PHP codebase. The scripts ultimately produce a formatted report of errors discovered in the analysis, but are also used to prepare the codebase and to directly compare reports.
- Clone this repository:
git clone git@github.com:wayfair/hussar - Edit
config.examplewith your project's details and save asconfig. - If necessary for your project, add a file named
local-patchto override the default functions provided inlib/patching. cdto the root of your PHP codebase- Run
/path/to/hussar/bin/main -F -r master - Run
less /path/to/hussar/reports/latestto view the report
Note:
This project requires
hhvm <= 3.3.x. The-t analyzeflag is not present in more recent versions of HHVM and there does not appear to be equivalent functionality for analyzing PHP code (see issue-3987).
hussar was developed in a GNU/Linux environment and uses many standard
utilities that we do not list here (eg, find, grep, rsync, mkfifo etc).
The following additional packages are required to use the tool:
hhvm <= 3.3.x(see note, above)bash >= 4.0(for associative array support)git(not required for all scripts, see Usage section below)php(for building the.class-indexand parsing HHVM output)perl(for patching files in place)
PHP projects that are unable to run directly on HHVM can still take advantage of identifying errors found by its strong typing and static analysis of code. Unfortunately the output provided by HHVM isn't so friendly, and some prep work is required before analysis can be performed at all since the static analyzer lacks support for some common PHP functionality (eg, autoloading). The purpose of this project is to facilitate use of the static analysis capabilities of HHVM so that it can be included as part of a regular test, build, and deploy cycle.
The scripts under bin/ are designed to be able to run independently, with the
bin/main script wrapping the functionality of the others. For typical PHP
projects using git, try the following commands:
# Commands expect to run within the root of the codebase
$ cd /path/to/codebase
# Analyze all files in the codebase
$ /path/to/hussar/bin/main -F
# View the latest full report
$ less /path/to/hussar/reports/latestProjects are likely to have a number of errors in their codebase and developers shouldn't have to see the full list every time. The tool can restrict output to just those errors that might be introduced by a developer's changes:
# Analyze for errors in a patchfile (compared to origin/master)
$ /path/to/hussar/bin/main -p /path/to/file.patch
# Analyze for errors in a branch (compared to merge-base with origin/master)
$ /path/to/hussar/bin/main -r feature/branch -t 'fork-point'Options can be combined to handle more complex cases:
# Analyze file.patch applied to feature/branch and compare to origin/master:
$ /path/to/hussar/bin/main -p /path/to/file.patch -r feature/branch
# As above, but compare to feature/base instead:
$ /path/to/bin/main -p /path/to/file.patch -r feature/branch -t feature/baseFor these more complex uses, it can be helpful to enable debugging output:
# Enable debugging messages
$ /path/to/hussar/bin/main -d -p /path/to/file.patchProjects not using git can still take advantage of the tool. Patch the
codebase manually (see next section) and run the bin/make-report command:
# Make report for default branch
$ hg up default
$ /path/to/hussar/bin/make-report -p -r $(hg id)The bin/make-diff-report script should easily translate to other source
control systems but is not currently supported outside git. The other scripts
are more tightly coupled to git commands, but it should be possible to
translate them as well.
Being static analysis, HHVM requires a few codebase modifications to support
some standard PHP features and make the results of the analysis meaningful.
Applying these modifications is the responsibility of the bin/patch-files
script, which modifies files to correct for the following limitations:
-
Autoprepend and Autoloading:
The
auto_prepend_filedirective is not supported when runninghhvmwith the-t analyzeflag (see issue-2990). Autoloading is also unsupported (see issue-253) and classes meant to be included viaspl_autoload_registerorhh\autoload_set_pathsare instead reported asUnknownClass. These limitations result in a cascade of false-positive errors for any files that depend on those features. To resolve these issues, the script inserts hardrequire_oncestatements into every file, using a regex to identify what classes a file contains. -
PHP incompatibilities and third-party packages:
The HHVM team has done a great job reimplementing PHP functionality, but there are still a number of incompatibilities beyond the two described above, and often projects use third-party code that can clutter reports. When present, the script adds
require_oncestatements for all files under thestubs/directory to theauto_prepend.phpscript so that these objects are always defined. -
Runtime defined state:
The whole point of static analysis is to discover errors without executing the program, so it's no surprise that HHVM emits some false-positive errors for states known only at runtime. Constants whose values are defined within conditional expressions produce
DeclaredConstantTwiceerrors when in reality only one branch would be taken at runtime. HHVM does not store the value of conditionally defined constants, so concatenating such constants with strings (as inrequire_once PROJECT_ROOT . '/path/to/file.php';) results inUnknownClassorUndefinedFunctionerrors. The script can be configured to patch such constants to static string values.
Patching and analyzing a large codebase can take several minutes, a delay that
discourages developers from using the tool to avoid interrupting their workflow.
While hussar can't do much to improve the run time of hhvm, it employs a few
caches to reduce the time required to patch files. Both caches are created by
default during the first invocation of hussar.
The first cache is the .class-index, which is a simple mapping from a fully
qualified class name to its corresponding file. This is encoded as a Bash
associative array and sourced before patching files. Without this cache, class
files must be "autoloaded" at patch time, which is very expensive because it
requires spinning up a php process for every file to be patched.
# Make the .class-index file
$ /path/to/hussar/bin/main -cThe second cache is the patch cache, which is a full copy of the codebase after
having patched all files. This cache is built via rsync and cuts the time
required to patch files down to a few seconds by reducing the number of files to
be patched to just those changed or introduced in a revision or patchfile.
# Fully patch the codebase and update the patch cache to match
$ /path/to/bin/main -F -uAs described in the Usage section above, hussar tries to be flexible with
what can be analyzed. The bin/prepare-workspace and bin/patch-files scripts
work together to ensure that the report is actually performed against the
specified revisions. Getting this right is tricky and requires keeping track of
the hashes for origin/master, the patch cache, the revision, the comparand,
and the patchfile (modulo which options are present).
The former script maintains the workspace as necessary by resetting HEAD and
pulling down the patch cache, then replacing the cached versions of files with
their equivalent version in the specified revision or patchfile. Patchfiles are
then committed so that bin/make-report can save the report under the new
revision. If the patch cache is ahead of the specified revision, files not
present in the revision are removed. If the patch cache is behind, files added
in the revision are checked out to the index as new files. Upon completion, all
files not modified by the revision or patchfile are modified in the working tree
to their patch cache equivalents; that is, the workspace is identical to the
revision (or patchfile applied to the revision) but with patched versions of
nearly all files.
The latter script patches any files deemed necessary to patch (as described
above in Limitations of HHVM static analysis). Passing the -F option
specifies all files in the codebase. When a revision is specified, only the
files modified between HEAD and the revision are patched. This is precisely
where bin/prepare-workspace leaves off, so that by the end of these two
scripts all files in the codebase have been patched while otherwise reflecting
their contents at the specified revision (plus patchfile).
hussar is distributed with an ISC License. See LICENSE for details.