- About this document
- Getting the code
- Running dbt-impala in development
- Testing
- Submitting a Pull Request
This document is a guide for anyone interested in contributing to the dbt-impala repository. It outlines how to create issues and submit pull requests (PRs).
This is not intended as a guide for using dbt-impala in a project.
We assume users have a Linux or MacOS system. You should have familiarity with:
- Python virtualenvs
- Python modules
- pip
- common command line utilities like git.
In addition to this guide, we highly encourage you to read the dbt-core contributing guide. Almost all information there is applicable here!
git is needed in order to download and modify the dbt-impala code. There are several ways to install Git. For MacOS, we suggest installing Xcode or Xcode Command Line Tools.
If you are not a member of the Cloudera GitHub organization, you can contribute to dbt-impala by forking the dbt-impala repository. For more on forking, check out the GitHub docs on forking. In short, you will need to:
- fork the dbt-impala repository
- clone your fork locally
- check out a new branch for your proposed changes
- push changes to your fork
- open a pull request of your forked repository against cloudera/dbt-impala
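The fork workflow above can be sketched with plain git commands. This is a minimal sketch, not an official script: the helper name start_contribution, the username, and the branch name are hypothetical placeholders, and it assumes you have already forked the repository on GitHub and use HTTPS remotes.

```shell
# Hypothetical helper: clone your fork, track the upstream repository,
# and start a feature branch.
#   $1 = your GitHub username (placeholder)
#   $2 = name of the new branch (placeholder)
start_contribution() {
  gh_user="$1"
  branch="$2"
  git clone "https://github.com/${gh_user}/dbt-impala.git" &&
    cd dbt-impala &&
    git remote add upstream https://github.com/cloudera/dbt-impala.git &&
    git checkout -b "$branch"
}
```

For example, start_contribution your-username my-feature-branch leaves you inside the clone on the new branch. After committing, git push -u origin my-feature-branch publishes the branch to your fork, and the pull request against cloudera/dbt-impala is then opened from the GitHub web UI.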
If you are a member of the Cloudera GitHub organization, you will have push access to the dbt-impala repo. Rather than forking dbt-impala to make your changes, clone the repository like normal, and check out feature branches.
- Ensure you have the latest version of pip installed by running pip install --upgrade pip in terminal.
- Configure and activate a virtualenv as described in Setting up an environment.
- Install dbt-core in the active virtualenv. To confirm you installed dbt correctly, run dbt --version and which dbt.
- Install dbt-impala and development dependencies in the active virtualenv. Run pip install -e . -r dev-requirements.txt.
- Add the pre-commit hook. Run pre-commit install.
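Taken together, the setup steps might look like the sketch below. It is an illustration under stated assumptions: the helper name setup_dev_env and the .venv directory are hypothetical, python3 -m venv stands in for whichever virtualenv tool you prefer, and the function must be run from the root of your dbt-impala checkout.

```shell
# Hypothetical helper: create a virtualenv, install dbt-core plus
# dbt-impala in editable mode, and register the pre-commit hook.
# Assumption: the virtualenv lives in .venv (any directory name works).
setup_dev_env() {
  python3 -m venv .venv &&
    . .venv/bin/activate &&
    python -m pip install --upgrade pip &&
    python -m pip install dbt-core &&
    python -m pip install -e . -r dev-requirements.txt &&
    pre-commit install &&
    dbt --version &&
    which dbt
}
```

Because a shell function runs in the current shell, the virtualenv stays active after setup_dev_env returns. The final which dbt should print a path inside .venv; if it points elsewhere, another dbt installation is shadowing the one you just installed.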
When dbt-impala is installed this way, any changes you make to the dbt-impala source code will be reflected immediately (i.e. in your next local dbt invocation against an Impala target).
dbt-impala contains functional tests. Functional tests require an actual Impala warehouse to test against.
- You can run functional tests "locally" by configuring a test.env file with appropriate ENV variables.
- To run Kudu functional tests as part of the test suite when underlying storage is Kudu, please set the ENV variable DISABLE_KUDU_TEST to false. Kudu tests are disabled by default as this ENV variable is set to true.
cp test.env.example test.env
$EDITOR test.env
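If you prefer to toggle the Kudu tests without opening an editor, something like the following could work. It is a sketch that assumes test.env holds one KEY=value pair per line; check test.env.example for the actual layout before relying on it. ENV_FILE is a hypothetical variable introduced here only to make the target file explicit.

```shell
# Assumption: test.env uses one KEY=value pair per line.
ENV_FILE="${ENV_FILE:-test.env}"
touch "$ENV_FILE"

# Drop any existing DISABLE_KUDU_TEST line, then append the new value.
# grep -v exits non-zero when it prints nothing, hence the "|| true".
grep -v '^DISABLE_KUDU_TEST=' "$ENV_FILE" > "$ENV_FILE.tmp" || true
echo 'DISABLE_KUDU_TEST=false' >> "$ENV_FILE.tmp"
mv "$ENV_FILE.tmp" "$ENV_FILE"

# Show the resulting setting.
grep '^DISABLE_KUDU_TEST=' "$ENV_FILE"
```

Rewriting the file through a temporary copy keeps every other setting untouched and guarantees the variable appears exactly once.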
WARNING: The parameters in your test.env file must link to a valid Impala instance. The test.env file you create is git-ignored, but please be extra careful to never check in credentials or other sensitive information when developing.
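Before committing, you can ask git itself whether a file is ignored. The helper below is a small sketch built on git check-ignore; the name check_ignored is hypothetical and not part of the repository.

```shell
# Hypothetical helper: report whether git ignores the given path.
# Returns 0 when the path is ignored, 1 otherwise.
check_ignored() {
  if git check-ignore -q "$1"; then
    echo "$1 is ignored by git"
  else
    echo "WARNING: $1 is NOT ignored by git" >&2
    return 1
  fi
}
```

Running check_ignored test.env from the repository root should report that the file is ignored. A warning means the file could end up in a commit, so inspect .gitignore before staging anything.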
There are a few methods for running tests locally.
tox takes care of managing Python virtualenvs and installing dependencies in order to run tests.
To run an individual test:
make test TESTS=tests/functional/adapter/test_basic.py::TestSimpleMaterializationsImpala
To run an individual test for a specific Python version:
make test TESTS=tests/functional/adapter/test_basic.py::TestSimpleMaterializationsImpala PYTHON_VERSION=py38
To run tests across all versions of Python:
make test_all_python_versions TESTS=tests/functional/adapter/test_basic.py::TestSimpleMaterializationsImpala
The configuration of these tests is located in tox.ini.
NOTE:
- Python versions for which you are running tests have to be installed on your machine manually.
- To configure pytest settings, update pytest.ini. By default, the logs from all test runs are captured in logs/<test-run>/dbt.log
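Since the interpreters have to be installed manually, it can help to check which ones are on your PATH before starting a multi-version run. The sketch below assumes interpreters are exposed as pythonX.Y executables, which is common but not universal; the helper name check_pythons is hypothetical, and the versions you pass should match those listed in tox.ini.

```shell
# Hypothetical helper: report which pythonX.Y interpreters are on PATH.
# Returns 1 if any requested version is missing, 0 otherwise.
check_pythons() {
  missing=0
  for v in "$@"; do
    if command -v "python$v" >/dev/null 2>&1; then
      echo "python$v: found"
    else
      echo "python$v: missing"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, check_pythons 3.8 covers the py38 environment used in the command above; add further versions as arguments to check several at once.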
You may run a specific test or group of tests using pytest directly. Activate a Python virtualenv with dev dependencies installed. Use the appropriate profile like cdh_endpoint or dwx_endpoint. Then, run tests like so:
# Note: replace $strings with valid names
# run the full test suite against an environment/endpoint
python -m pytest --profile dwx_endpoint
# run all impala functional tests in a directory
python -m pytest tests/functional/$test_directory --profile dwx_endpoint
python -m pytest tests/functional/adapter/test_basic.py --profile dwx_endpoint
# run all impala functional tests in a module
python -m pytest --profile dwx_endpoint tests/functional/$test_dir_and_filename.py
python -m pytest --profile dwx_endpoint tests/functional/adapter/test_basic.py
# run all impala functional tests in a class
python -m pytest --profile dwx_endpoint tests/functional/$test_dir_and_filename.py::$test_class_name
python -m pytest --profile dwx_endpoint tests/functional/adapter/test_basic.py::TestSimpleMaterializationsImpala
# run a specific impala functional test
python -m pytest --profile dwx_endpoint tests/functional/$test_dir_and_filename.py::$test_class_name::$test_method_name
python -m pytest --profile dwx_endpoint tests/functional/adapter/test_basic.py::TestSimpleMaterializationsImpala::test_base
# run a specific unit test
python3 -m pytest tests/unit/test_exceptions.py
To configure pytest settings, update pytest.ini. By default, the logs from all test runs are captured in logs/<test-run>/dbt.log
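When a test fails, the log of the most recent run is usually the one you want. A minimal sketch for locating it, assuming the logs/<test-run>/dbt.log layout described above; the helper name latest_dbt_log is hypothetical.

```shell
# Hypothetical helper: print the most recently modified dbt.log.
#   $1 = logs directory (defaults to "logs")
# Parsing ls output is acceptable here only because run directory names
# are assumed to contain no newlines.
latest_dbt_log() {
  ls -t "${1:-logs}"/*/dbt.log 2>/dev/null | head -n 1
}
```

For example, tail -n 50 "$(latest_dbt_log)" shows the end of the newest log. The function prints nothing when no log exists, so check for empty output before passing it on.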
A dbt-impala maintainer will review your PR and will determine if it has passed regression tests. They may suggest code revisions for style and clarity, or they may request that you add unit or functional tests. These are good things! We believe that, with a little bit of help, anyone can contribute high-quality code.
Once all tests are passing and your PR has been approved, a dbt-impala maintainer will merge your changes into the active development branch. And that's it! Happy developing 🎉