Small code clone detection tool. It implements an algorithm from SourcererCC with adaptive prefix filtering optimizations and displays its results as HTML.
It works with JavaScript, Python, Java, Go, C++, PHP, C#, C, Swift, Kotlin and Haskell.
potator supports Linux and macOS. It can also be used on Windows under WSL.
potator can be installed using pip:
pip install potator

It can also be installed from source:

git clone https://github.com/otzhora/potator
cd potator
./install.sh

potator is run from the command line:

potator [-h] [-d {Naive,Filtering}] [--depth DEPTH] [-t THRESHOLD] [-g GRANULARITY] [-o OUT] directory

- You can choose one of two detectors: Naive and Filtering. The Naive detector compares every pair of source code fragments and calculates the Jaccard similarity between them. The Filtering detector implements the algorithm from the SourcererCC paper with adaptive prefix filtering optimizations.
- depth specifies the maximum depth of the adaptive prefix. depth=2 is recommended, since it offers the best balance between the cost of building the index and the cost of querying it.
- threshold is the minimum similarity score that two code fragments must have to be considered clones.
- granularity specifies the granularity of code blocks. The options are functions and classes; functions is recommended.
- out specifies the name of the resulting HTML file.
- directory is the directory containing the files to search.
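To make the difference between the two detectors more concrete, here is a minimal, self-contained sketch. It is not potator's actual code: it assumes each fragment is already reduced to a set of tokens, and every function name in it is hypothetical. The filtered variant shows only plain (non-adaptive) prefix filtering and still loops over all pairs instead of building an inverted index, so it illustrates which comparisons can be skipped, not how the real index is built.

```python
from collections import Counter
from itertools import combinations
from math import ceil


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |a & b| / |a | b|; two empty sets score 0.0 here."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def naive_clones(fragments: dict, threshold: float) -> list:
    """Score every pair of fragments and keep those at or above threshold."""
    result = []
    for (name_a, tok_a), (name_b, tok_b) in combinations(fragments.items(), 2):
        score = jaccard(tok_a, tok_b)
        if score >= threshold:
            result.append((name_a, name_b, score))
    return result


def prefix(tokens: set, order: dict, threshold: float) -> set:
    """First n - ceil(threshold * n) + 1 tokens under a global rarest-first order."""
    ranked = sorted(tokens, key=lambda tok: order[tok])
    # The small epsilon guards against float noise such as 0.1 * 3 > 0.3.
    length = len(ranked) - ceil(threshold * len(ranked) - 1e-9) + 1
    return set(ranked[:length])


def filtered_clones(fragments: dict, threshold: float) -> list:
    """Score only the pairs whose prefixes share at least one token."""
    counts = Counter(tok for toks in fragments.values() for tok in toks)
    # Rare tokens come first; ties are broken by the token text itself.
    ranked = sorted(counts, key=lambda tok: (counts[tok], tok))
    order = {tok: rank for rank, tok in enumerate(ranked)}
    prefixes = {name: prefix(toks, order, threshold)
                for name, toks in fragments.items()}
    result = []
    for name_a, name_b in combinations(fragments, 2):
        if not prefixes[name_a] & prefixes[name_b]:
            continue  # disjoint prefixes: the pair cannot reach the threshold
        score = jaccard(fragments[name_a], fragments[name_b])
        if score >= threshold:
            result.append((name_a, name_b, score))
    return result


if __name__ == "__main__":
    # Hypothetical token sets standing in for three extracted functions.
    fragments = {
        "a": {"def", "add", "x", "y", "return", "+"},
        "b": {"def", "sum", "x", "y", "return", "+"},
        "c": {"class", "Foo", "pass"},
    }
    print(naive_clones(fragments, 0.7))
    print(filtered_clones(fragments, 0.7))
```

Both functions report the same pairs. The filter only avoids computing the full similarity for pairs that cannot reach the threshold, which is where the saving comes from on large code bases.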
You can also run export DEBUG=1 before the search to have profiling information printed out.
You can import the detectors or the entities extractor from potator and use them to work with source code:
>>> from potator.detectors import FilteringDetector
>>> detector = FilteringDetector()
>>> detector.detect(directory, threshold, granularity)

>>> from potator.extractors import EntitiesExtractor
>>> EntitiesExtractor.extract_data_from_directory(directory, granularity)