<html>
<head>
<meta http-equiv="Content-Type"
content="text/html; charset=ISO-8859-1" />
<title></title>
<style type="text/css">
table.diff {font-family:Courier; border:medium;}
.diff_header {background-color:#e0e0e0}
td.diff_header {text-align:right}
.diff_next {background-color:#c0c0c0}
.diff_add {background-color:#aaffaa}
.diff_chg {background-color:#ffff77}
.diff_sub {background-color:#ffaaaa}
</style>
</head>
<body>
These results were produced by NaiveDetector() with directory: /mnt/c/Users/Yuriy Rogachev/PycharmProjects/code duplication detection/duplication, threshold: 0.8, granularity: functions. <br>
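<p>The tables in this report use the markup that Python's difflib.HtmlDiff emits. The sketch below shows one way such a thresholded report could be assembled. It is not NaiveDetector's implementation: the similarity metric (a character-level difflib.SequenceMatcher ratio), the helper names similarity and render_pair, and the shortened sample snippets and paths are all assumptions made for illustration.</p>
<pre>
```python
import difflib

# Hypothetical inputs: two function bodies as lists of source lines.
left = [
    "def get_enry_dir():",
    '    return os.path.join(os.path.dirname(__file__), "build")',
]
right = [
    "def get_tree_sitter_dir():",
    '    return os.path.join(os.path.dirname(__file__), "build")',
]

THRESHOLD = 0.8  # same value as the threshold reported above


def similarity(a, b):
    """Character-level ratio in [0, 1]; an assumed stand-in metric."""
    return difflib.SequenceMatcher(None, "\n".join(a), "\n".join(b)).ratio()


def render_pair(a, b, from_desc, to_desc):
    """Return an HTML diff table for the pair, or None if below THRESHOLD."""
    score = similarity(a, b)
    if THRESHOLD > score:
        return None
    header = f"{from_desc}, similarity: {score}"
    return difflib.HtmlDiff().make_table(a, b, fromdesc=header, todesc=to_desc)


# Shortened, hypothetical path labels for the two column headers.
table = render_pair(left, right, "language_recognition/utils.py", "parsing/utils.py")
```
</pre>
<p>Pairs scoring below the threshold return None and are simply left out of the report. A detector working at token level would likely produce different scores than this character-level ratio.</p>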
<table class="diff" id="difflib_chg_to0__top"
cellspacing="0" cellpadding="0" rules="groups" >
<colgroup></colgroup> <colgroup></colgroup> <colgroup></colgroup>
<colgroup></colgroup> <colgroup></colgroup> <colgroup></colgroup>
<thead><tr><th class="diff_next"><br /></th><th colspan="2" class="diff_header">/mnt/c/Users/Yuriy Rogachev/PycharmProjects/code duplication detection/duplication/tokenizer/buckwheat/tokenizer.py, similarity: 0.9411764705882353</th><th class="diff_next"><br /></th><th colspan="2" class="diff_header">/mnt/c/Users/Yuriy Rogachev/PycharmProjects/code duplication detection/duplication/tokenizer/buckwheat/tokenizer.py</th></tr></thead>
<tbody>
<tr><td class="diff_next" id="difflib_chg_to0__1"><a href="#difflib_chg_to0__1">n</a></td><td class="diff_header" id="from0_1">1</td><td nowrap="nowrap">def get_<span class="diff_sub">fun</span>c<span class="diff_chg">tion</span>s_from_file(file: str, lang: str, identifiers_verbose: bool = False,</td><td class="diff_next"><a href="#difflib_chg_to0__1">n</a></td><td class="diff_header" id="to0_1">1</td><td nowrap="nowrap">def get_c<span class="diff_chg">lasse</span>s_from_file(file: str, lang: str, identifiers_verbose: bool = False,</td></tr>
<tr><td class="diff_next"></td><td class="diff_header" id="from0_2">2</td><td nowrap="nowrap"><span class="diff_sub"> </span> subtokenize: bool = False) -> List[ObjectData]:</td><td class="diff_next"></td><td class="diff_header" id="to0_2">2</td><td nowrap="nowrap"> subtokenize: bool = False) -> List[ObjectData]:</td></tr>
<tr><td class="diff_next"></td><td class="diff_header" id="from0_3">3</td><td nowrap="nowrap"> """</td><td class="diff_next"></td><td class="diff_header" id="to0_3">3</td><td nowrap="nowrap"> """</td></tr>
<tr><td class="diff_next"><a href="#difflib_chg_to0__2">n</a></td><td class="diff_header" id="from0_4">4</td><td nowrap="nowrap"> Yield ObjectData objects for <span class="diff_sub">fun</span>c<span class="diff_chg">tion</span>s in a given file.</td><td class="diff_next"><a href="#difflib_chg_to0__2">n</a></td><td class="diff_header" id="to0_4">4</td><td nowrap="nowrap"> Yield ObjectData objects for c<span class="diff_chg">la</span>s<span class="diff_add">ses</span> in a given file.</td></tr>
<tr><td class="diff_next" id="difflib_chg_to0__2"></td><td class="diff_header" id="from0_5">5</td><td nowrap="nowrap"> :param file: the path to file.</td><td class="diff_next"></td><td class="diff_header" id="to0_5">5</td><td nowrap="nowrap"> :param file: the path to file.</td></tr>
<tr><td class="diff_next"></td><td class="diff_header" id="from0_6">6</td><td nowrap="nowrap"> :param lang: the language of the file.</td><td class="diff_next"></td><td class="diff_header" id="to0_6">6</td><td nowrap="nowrap"> :param lang: the language of the file.</td></tr>
<tr><td class="diff_next" id="difflib_chg_to0__3"></td><td class="diff_header" id="from0_7">7</td><td nowrap="nowrap"> :param identifiers_verbose: if True, will save not only identifiers themselves,</td><td class="diff_next"></td><td class="diff_header" id="to0_7">7</td><td nowrap="nowrap"> :param identifiers_verbose: if True, will save not only identifiers themselves,</td></tr>
<tr><td class="diff_next"></td><td class="diff_header" id="from0_8">8</td><td nowrap="nowrap"> but also their parameters as IdentifierData.</td><td class="diff_next"></td><td class="diff_header" id="to0_8">8</td><td nowrap="nowrap"> but also their parameters as IdentifierData.</td></tr>
<tr><td class="diff_next"></td><td class="diff_header" id="from0_9">9</td><td nowrap="nowrap"> :param subtokenize: if True, will split the tokens into subtokens.</td><td class="diff_next"></td><td class="diff_header" id="to0_9">9</td><td nowrap="nowrap"> :param subtokenize: if True, will split the tokens into subtokens.</td></tr>
<tr><td class="diff_next"><a href="#difflib_chg_to0__3">n</a></td><td class="diff_header" id="from0_10">10</td><td nowrap="nowrap"> :return: an iterator of ObjectData objects for <span class="diff_sub">fun</span>c<span class="diff_chg">tion</span>s.</td><td class="diff_next"><a href="#difflib_chg_to0__3">n</a></td><td class="diff_header" id="to0_10">10</td><td nowrap="nowrap"> :return: an iterator of ObjectData objects for c<span class="diff_chg">lasse</span>s.</td></tr>
<tr><td class="diff_next"></td><td class="diff_header" id="from0_11">11</td><td nowrap="nowrap"> """</td><td class="diff_next"></td><td class="diff_header" id="to0_11">11</td><td nowrap="nowrap"> """</td></tr>
<tr><td class="diff_next"><a href="#difflib_chg_to0__4">n</a></td><td class="diff_header" id="from0_12">12</td><td nowrap="nowrap"> if lang not in SUPPORTED_LANGUAGES["<span class="diff_sub">fun</span>c<span class="diff_chg">tion</span>s"]:</td><td class="diff_next"><a href="#difflib_chg_to0__4">n</a></td><td class="diff_header" id="to0_12">12</td><td nowrap="nowrap"> if lang not in SUPPORTED_LANGUAGES["c<span class="diff_chg">lasse</span>s"]:</td></tr>
<tr><td class="diff_next"></td><td class="diff_header" id="from0_13">13</td><td nowrap="nowrap"> raise ValueError(f"{lang} doesn't support gathering functions!")</td><td class="diff_next"></td><td class="diff_header" id="to0_13">13</td><td nowrap="nowrap"> raise ValueError(f"{lang} doesn't support gathering functions!")</td></tr>
<tr><td class="diff_next" id="difflib_chg_to0__4"></td><td class="diff_header" id="from0_14">14</td><td nowrap="nowrap"> file_data = TreeSitterParser.get_data_from_file(file, lang, gather_objects=True,</td><td class="diff_next"></td><td class="diff_header" id="to0_14">14</td><td nowrap="nowrap"> file_data = TreeSitterParser.get_data_from_file(file, lang, gather_objects=True,</td></tr>
<tr><td class="diff_next"></td><td class="diff_header" id="from0_15">15</td><td nowrap="nowrap"> gather_identifiers=False,</td><td class="diff_next"></td><td class="diff_header" id="to0_15">15</td><td nowrap="nowrap"> gather_identifiers=False,</td></tr>
<tr><td class="diff_next"></td><td class="diff_header" id="from0_16">16</td><td nowrap="nowrap"> identifiers_verbose=identifiers_verbose,</td><td class="diff_next"></td><td class="diff_header" id="to0_16">16</td><td nowrap="nowrap"> identifiers_verbose=identifiers_verbose,</td></tr>
<tr><td class="diff_next"></td><td class="diff_header" id="from0_17">17</td><td nowrap="nowrap"> subtokenize=subtokenize)</td><td class="diff_next"></td><td class="diff_header" id="to0_17">17</td><td nowrap="nowrap"> subtokenize=subtokenize)</td></tr>
<tr><td class="diff_next"></td><td class="diff_header" id="from0_18">18</td><td nowrap="nowrap"> for obj in file_data.objects:</td><td class="diff_next"></td><td class="diff_header" id="to0_18">18</td><td nowrap="nowrap"> for obj in file_data.objects:</td></tr>
<tr><td class="diff_next"><a href="#difflib_chg_to0__top">t</a></td><td class="diff_header" id="from0_19">19</td><td nowrap="nowrap"> if obj.object_type == ObjectTypes.<span class="diff_sub">FUN</span>C<span class="diff_chg">TION</span>:</td><td class="diff_next"><a href="#difflib_chg_to0__top">t</a></td><td class="diff_header" id="to0_19">19</td><td nowrap="nowrap"> if obj.object_type == ObjectTypes.C<span class="diff_chg">LASS</span>:</td></tr>
<tr><td class="diff_next"></td><td class="diff_header" id="from0_20">20</td><td nowrap="nowrap"> yield obj</td><td class="diff_next"></td><td class="diff_header" id="to0_20">20</td><td nowrap="nowrap"> yield obj</td></tr>
</tbody>
</table><br>
<table class="diff" id="difflib_chg_to1__top"
cellspacing="0" cellpadding="0" rules="groups" >
<colgroup></colgroup> <colgroup></colgroup> <colgroup></colgroup>
<colgroup></colgroup> <colgroup></colgroup> <colgroup></colgroup>
<thead><tr><th class="diff_next"><br /></th><th colspan="2" class="diff_header">/mnt/c/Users/Yuriy Rogachev/PycharmProjects/code duplication detection/duplication/tokenizer/buckwheat/language_recognition/utils.py, similarity: 0.9166666666666666</th><th class="diff_next"><br /></th><th colspan="2" class="diff_header">/mnt/c/Users/Yuriy Rogachev/PycharmProjects/code duplication detection/duplication/tokenizer/buckwheat/parsing/utils.py</th></tr></thead>
<tbody>
<tr><td class="diff_next" id="difflib_chg_to1__1"><a href="#difflib_chg_to1__1">n</a></td><td class="diff_header" id="from1_1">1</td><td nowrap="nowrap">def get_e<span class="diff_chg">n</span>r<span class="diff_sub">y</span>_dir() -> str:</td><td class="diff_next"><a href="#difflib_chg_to1__1">n</a></td><td class="diff_header" id="to1_1">1</td><td nowrap="nowrap">def get_<span class="diff_add">tr</span>e<span class="diff_chg">e_sitte</span>r_dir() -> str:</td></tr>
<tr><td class="diff_next"></td><td class="diff_header" id="from1_2">2</td><td nowrap="nowrap"> """</td><td class="diff_next"></td><td class="diff_header" id="to1_2">2</td><td nowrap="nowrap"> """</td></tr>
<tr><td class="diff_next"><a href="#difflib_chg_to1__top">t</a></td><td class="diff_header" id="from1_3">3</td><td nowrap="nowrap"><span class="diff_sub"> Get the directory with Enry.</span></td><td class="diff_next"><a href="#difflib_chg_to1__top">t</a></td><td class="diff_header" id="to1_3">3</td><td nowrap="nowrap"><span class="diff_add"> Get tree-sitter directory.</span></td></tr>
<tr><td class="diff_next"></td><td class="diff_header" id="from1_4">4</td><td nowrap="nowrap"> :return: absolute path.</td><td class="diff_next"></td><td class="diff_header" id="to1_4">4</td><td nowrap="nowrap"> :return: absolute path.</td></tr>
<tr><td class="diff_next"></td><td class="diff_header" id="from1_5">5</td><td nowrap="nowrap"> """</td><td class="diff_next"></td><td class="diff_header" id="to1_5">5</td><td nowrap="nowrap"> """</td></tr>
<tr><td class="diff_next"></td><td class="diff_header" id="from1_6">6</td><td nowrap="nowrap"> return os.path.abspath(os.path.join(os.path.dirname(__file__), "build"))</td><td class="diff_next"></td><td class="diff_header" id="to1_6">6</td><td nowrap="nowrap"> return os.path.abspath(os.path.join(os.path.dirname(__file__), "build"))</td></tr>
</tbody>
</table><br>
<table class="diff" id="difflib_chg_to2__top"
cellspacing="0" cellpadding="0" rules="groups" >
<colgroup></colgroup> <colgroup></colgroup> <colgroup></colgroup>
<colgroup></colgroup> <colgroup></colgroup> <colgroup></colgroup>
<thead><tr><th class="diff_next"><br /></th><th colspan="2" class="diff_header">/mnt/c/Users/Yuriy Rogachev/PycharmProjects/code duplication detection/duplication/tokenizer/buckwheat/subtokenizer.py, similarity: 0.8666666666666667</th><th class="diff_next"><br /></th><th colspan="2" class="diff_header">/mnt/c/Users/Yuriy Rogachev/PycharmProjects/code duplication detection/duplication/tokenizer/buckwheat/subtokenizer.py</th></tr></thead>
<tbody>
<tr><td class="diff_next" id="difflib_chg_to2__2"><a href="#difflib_chg_to2__1">n</a></td><td class="diff_header" id="from2_1">1</td><td nowrap="nowrap"><span class="diff_sub">def stem_threshold(self, value):</span></td><td class="diff_next"><a href="#difflib_chg_to2__1">n</a></td><td class="diff_header" id="to2_1">1</td><td nowrap="nowrap"><span class="diff_add">def max_token_length(self, value):</span></td></tr>
<tr><td class="diff_next"></td><td class="diff_header" id="from2_2">2</td><td nowrap="nowrap"> if not isinstance(value, int):</td><td class="diff_next"></td><td class="diff_header" id="to2_2">2</td><td nowrap="nowrap"> if not isinstance(value, int):</td></tr>
<tr><td class="diff_next"><a href="#difflib_chg_to2__2">n</a></td><td class="diff_header" id="from2_3">3</td><td nowrap="nowrap"> raise TypeError("<span class="diff_sub">ste</span>m_th<span class="diff_sub">reshold</span> must be an integer - got %s" % type(value))</td><td class="diff_next"><a href="#difflib_chg_to2__2">n</a></td><td class="diff_header" id="to2_3">3</td><td nowrap="nowrap"> raise TypeError("m<span class="diff_add">ax</span>_t<span class="diff_add">oken_lengt</span>h must be an integer - got %s" % type(value))</td></tr>
<tr><td class="diff_next"></td><td class="diff_header" id="from2_4">4</td><td nowrap="nowrap"> if value < 1:</td><td class="diff_next"></td><td class="diff_header" id="to2_4">4</td><td nowrap="nowrap"> if value < 1:</td></tr>
<tr><td class="diff_next"><a href="#difflib_chg_to2__top">t</a></td><td class="diff_header" id="from2_5">5</td><td nowrap="nowrap"> raise ValueError("<span class="diff_sub">ste</span>m_th<span class="diff_sub">reshold</span> must be greater than 0 - got %d" % value)</td><td class="diff_next"><a href="#difflib_chg_to2__top">t</a></td><td class="diff_header" id="to2_5">5</td><td nowrap="nowrap"> raise ValueError("m<span class="diff_add">ax</span>_t<span class="diff_add">oken_lengt</span>h must be greater than 0 - got %d" % value)</td></tr>
<tr><td class="diff_next"></td><td class="diff_header" id="from2_6">6</td><td nowrap="nowrap"><span class="diff_sub"> self._stem_threshold = value</span></td><td class="diff_next"></td><td class="diff_header" id="to2_6">6</td><td nowrap="nowrap"><span class="diff_add"> self._max_token_length = value</span></td></tr>
</tbody>
</table><br>
<table class="diff" id="difflib_chg_to3__top"
cellspacing="0" cellpadding="0" rules="groups" >
<colgroup></colgroup> <colgroup></colgroup> <colgroup></colgroup>
<colgroup></colgroup> <colgroup></colgroup> <colgroup></colgroup>
<thead><tr><th class="diff_next"><br /></th><th colspan="2" class="diff_header">/mnt/c/Users/Yuriy Rogachev/PycharmProjects/code duplication detection/duplication/tokenizer/buckwheat/subtokenizer.py, similarity: 0.8666666666666667</th><th class="diff_next"><br /></th><th colspan="2" class="diff_header">/mnt/c/Users/Yuriy Rogachev/PycharmProjects/code duplication detection/duplication/tokenizer/buckwheat/subtokenizer.py</th></tr></thead>
<tbody>
<tr><td class="diff_next" id="difflib_chg_to3__2"><a href="#difflib_chg_to3__1">n</a></td><td class="diff_header" id="from3_1">1</td><td nowrap="nowrap"><span class="diff_sub">def stem_threshold(self, value):</span></td><td class="diff_next"><a href="#difflib_chg_to3__1">n</a></td><td class="diff_header" id="to3_1">1</td><td nowrap="nowrap"><span class="diff_add">def min_split_length(self, value):</span></td></tr>
<tr><td class="diff_next"></td><td class="diff_header" id="from3_2">2</td><td nowrap="nowrap"> if not isinstance(value, int):</td><td class="diff_next"></td><td class="diff_header" id="to3_2">2</td><td nowrap="nowrap"> if not isinstance(value, int):</td></tr>
<tr><td class="diff_next"><a href="#difflib_chg_to3__2">n</a></td><td class="diff_header" id="from3_3">3</td><td nowrap="nowrap"> raise TypeError("ste<span class="diff_chg">m_</span>th<span class="diff_sub">reshold</span> must be an integer - got %s" % type(value))</td><td class="diff_next"><a href="#difflib_chg_to3__2">n</a></td><td class="diff_header" id="to3_3">3</td><td nowrap="nowrap"> raise TypeError("<span class="diff_add">min_</span>s<span class="diff_add">pli</span>t<span class="diff_add">_l</span>e<span class="diff_chg">ng</span>th must be an integer - got %s" % type(value))</td></tr>
<tr><td class="diff_next"></td><td class="diff_header" id="from3_4">4</td><td nowrap="nowrap"> if value < 1:</td><td class="diff_next"></td><td class="diff_header" id="to3_4">4</td><td nowrap="nowrap"> if value < 1:</td></tr>
<tr><td class="diff_next"><a href="#difflib_chg_to3__top">t</a></td><td class="diff_header" id="from3_5">5</td><td nowrap="nowrap"> raise ValueError("ste<span class="diff_chg">m_</span>th<span class="diff_sub">reshold</span> must be greater than 0 - got %d" % value)</td><td class="diff_next"><a href="#difflib_chg_to3__top">t</a></td><td class="diff_header" id="to3_5">5</td><td nowrap="nowrap"> raise ValueError("<span class="diff_add">min_</span>s<span class="diff_add">pli</span>t<span class="diff_add">_l</span>e<span class="diff_chg">ng</span>th must be greater than 0 - got %d" % value)</td></tr>
<tr><td class="diff_next"></td><td class="diff_header" id="from3_6">6</td><td nowrap="nowrap"><span class="diff_sub"> self._stem_threshold = value</span></td><td class="diff_next"></td><td class="diff_header" id="to3_6">6</td><td nowrap="nowrap"><span class="diff_add"> self._min_split_length = value</span></td></tr>
</tbody>
</table><br>
<table class="diff" id="difflib_chg_to4__top"
cellspacing="0" cellpadding="0" rules="groups" >
<colgroup></colgroup> <colgroup></colgroup> <colgroup></colgroup>
<colgroup></colgroup> <colgroup></colgroup> <colgroup></colgroup>
<thead><tr><th class="diff_next"><br /></th><th colspan="2" class="diff_header">/mnt/c/Users/Yuriy Rogachev/PycharmProjects/code duplication detection/duplication/tokenizer/buckwheat/subtokenizer.py, similarity: 0.8666666666666667</th><th class="diff_next"><br /></th><th colspan="2" class="diff_header">/mnt/c/Users/Yuriy Rogachev/PycharmProjects/code duplication detection/duplication/tokenizer/buckwheat/subtokenizer.py</th></tr></thead>
<tbody>
<tr><td class="diff_next" id="difflib_chg_to4__2"><a href="#difflib_chg_to4__1">n</a></td><td class="diff_header" id="from4_1">1</td><td nowrap="nowrap">def m<span class="diff_chg">ax</span>_t<span class="diff_sub">oken</span>_length(self, value):</td><td class="diff_next"><a href="#difflib_chg_to4__1">n</a></td><td class="diff_header" id="to4_1">1</td><td nowrap="nowrap">def m<span class="diff_chg">in</span>_<span class="diff_add">spli</span>t_length(self, value):</td></tr>
<tr><td class="diff_next"></td><td class="diff_header" id="from4_2">2</td><td nowrap="nowrap"> if not isinstance(value, int):</td><td class="diff_next"></td><td class="diff_header" id="to4_2">2</td><td nowrap="nowrap"> if not isinstance(value, int):</td></tr>
<tr><td class="diff_next"><a href="#difflib_chg_to4__2">n</a></td><td class="diff_header" id="from4_3">3</td><td nowrap="nowrap"> raise TypeError("m<span class="diff_chg">ax</span>_t<span class="diff_sub">oken</span>_length must be an integer - got %s" % type(value))</td><td class="diff_next"><a href="#difflib_chg_to4__2">n</a></td><td class="diff_header" id="to4_3">3</td><td nowrap="nowrap"> raise TypeError("m<span class="diff_chg">in</span>_<span class="diff_add">spli</span>t_length must be an integer - got %s" % type(value))</td></tr>
<tr><td class="diff_next"></td><td class="diff_header" id="from4_4">4</td><td nowrap="nowrap"> if value < 1:</td><td class="diff_next"></td><td class="diff_header" id="to4_4">4</td><td nowrap="nowrap"> if value < 1:</td></tr>
<tr><td class="diff_next"><a href="#difflib_chg_to4__top">t</a></td><td class="diff_header" id="from4_5">5</td><td nowrap="nowrap"> raise ValueError("m<span class="diff_chg">ax</span>_t<span class="diff_sub">oken</span>_length must be greater than 0 - got %d" % value)</td><td class="diff_next"><a href="#difflib_chg_to4__top">t</a></td><td class="diff_header" id="to4_5">5</td><td nowrap="nowrap"> raise ValueError("m<span class="diff_chg">in</span>_<span class="diff_add">spli</span>t_length must be greater than 0 - got %d" % value)</td></tr>
<tr><td class="diff_next"></td><td class="diff_header" id="from4_6">6</td><td nowrap="nowrap"> self._m<span class="diff_chg">ax</span>_t<span class="diff_sub">oken</span>_length = value</td><td class="diff_next"></td><td class="diff_header" id="to4_6">6</td><td nowrap="nowrap"> self._m<span class="diff_chg">in</span>_<span class="diff_add">spli</span>t_length = value</td></tr>
</tbody>
</table><br> </body></html>