-s ,--src |
Input source directory. You must specify the target dir. |
None |
-mil ,--min-line |
Minimum number of lines that a code fragment must be to be treated as a clone. |
6 |
-mit ,--min-token |
Minimum number of tokens that a code fragment must be to be treated as a clone. |
50 |
-n ,--n-gram |
N for N-gram. |
5 |
-p ,--partition-size |
The number of partitions. |
10 |
-f ,--filtration-threshold |
Threshold used in the filtration phase (%). |
10 |
-v ,--verification-threshold |
Threshold used in the verificatioin phase (%). |
70 |
-o ,--output |
Output file name. |
result_{n}_{f}_{v}.csv |
-t ,--threads |
The number of threads used for parallel execution (both the Preprocess and Clone detection phases) |
all threads |
-l ,--language |
Target language |
java |
-bce ,--bigcloneeval |
If you specify -bce option, NIL outputs result file feasible to BigCloneEval. |
not specified |
-mif ,--mutationinjectionframework |
If you specify -mif option, NIL outputs nothing except for the output file name as standard output. |
not specified |
FSE ‘21 paper implementation (preprint)
DOI of submitted version executable file of NIL is
10.5281/zenodo.4492665
.NIL
NIL is a clone detector using N-gram, Inverted index, and LCS. NIL provides scalable large-variance clone detection.
Requirements
Install & Usage
git clone https://github.com/kusumotolab/NIL
)cd NIL
) and build NIL (./gradlew ShadowJar
)java -jar ./build/libs/NIL-all.jar [options]
)-bce
option, the format is/path/to/file_A,start_line_A,end_line_A,/path/to/file_B,start_line_B,end_line_B
.-bce
option, the format isdir_A,file_A,start_line_A,end_line_A,dir_B,file_B,start_line_B,end_line_B
Options
-s
,--src
-mil
,--min-line
6
-mit
,--min-token
50
-n
,--n-gram
5
-p
,--partition-size
10
-f
,--filtration-threshold
10
-v
,--verification-threshold
70
-o
,--output
result_{n}_{f}_{v}.csv
-t
,--threads
-l
,--language
java
-bce
,--bigcloneeval
-bce
option, NIL outputs result file feasible to BigCloneEval.-mif
,--mutationinjectionframework
-mif
option, NIL outputs nothing except for the output file name as standard output.Languages
java
.java
c
.c
,.h
cpp
.cpp
,.hpp
cs
,csharp
.cs
py
,python
.py
kt
,kotlin
.kt
If you execute NIL on 250-MLOC codebase, we recommend
-p
option to 135.Experiments, datasets, and baseline tools