End-To-End Dokumentation and cleanup #645

Merged
merged 5 commits into from
Sep 15, 2022
204 changes: 136 additions & 68 deletions jplag.endtoend-testing/README.md
@@ -1,9 +1,10 @@
# JPlag - End To End Testing

# JPlag - End-To-End Testing
With the help of the end-to-end module, changes to the detection of JPlag are to be tested.
With the help of elaborated plagiarisms, which have been worked out from suggestions in the literature on the topic of "plagiarism detection and avoidance", a wide range of detectable change can be covered. The selected plagiarisms are the decisive factor here as to whether a change in recognition can be perceived.
With the help of elaborated plagiarism, which has been worked out from suggestions in the literature on the topic of "plagiarism detection and avoidance", a wide range of detectable changes can be covered. The selected plagiarisms are the decisive factor here as to whether a change in recognition can be perceived.

## References
These elaborations provide basic ideas on how a modification of the plagiarized source code can look like or be adapted.
These elaborations provide basic ideas on how a modification of the plagiarized source code can look or be adapted.
These code adaptations refer to a wide range of changes starting from
adding/removing comments to architectural changes in the deliverables.

@@ -18,106 +19,122 @@ The following changes were applied to sample tasks to create test cases:
<li>Inserting comments or empty lines (normalization level)</li>
<li>Changing variable names or function names (normalization level)</li>
<li>Insertion of unnecessary or changed code lines (token generation)</li>
<li>Changing the program flow (token generation) (statments and functions must be independent from each other)</li>
<ul>
<li>Variable decleration at the beginning of the program</li>
<li>Combining declerations of variables</li>
<li>Changing the program flow (token generation) (statements and functions must be independent of each other)</li>
<ul type="1">
<li>Variable declaration at the beginning of the program</li>
<li>Combining declarations of variables</li>
<li>Reuse of the same variable for other functions</li>
</ul>
<li>Changing control structures</li>
<ul>
<ul type="1">
<li>for(...) to while(...)</li>
<li>if(...) to switch-case</li>
</ul>
<li>Modification of expressions</li>
<ul>
<ul type="1">
<li>(X < Y) to !(X >= Y) and ++x to x = x + 1</li>
</ul>
<li>Splitting and merging statements</li>
<ul>
<ul type="1">
<li>x = getSomeValue(); y = x - z; to y = (getSomeValue() - z);</li>
</ul>
<li>Inserting unnecessary casts</li>
</ul>
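
To make a few of the changes listed above concrete, the following sketch pairs an original snippet with a behaviorally equivalent rewrite. The class and method names (`TransformationExamples`, `sumFor`, `sumWhile`, `differenceSplit`, `differenceMerged`) are hypothetical and are not part of the JPlag test data.

```java
// Hypothetical illustration; none of these names exist in the JPlag test data.
final class TransformationExamples {

    // Original: summation with a for loop.
    static int sumFor(int[] values) {
        int sum = 0;
        for (int i = 0; i < values.length; i++) {
            sum += values[i];
        }
        return sum;
    }

    // Rewrite: for(...) changed to while(...), (X < Y) written as !(X >= Y),
    // and the increment written as x = x + 1.
    static int sumWhile(int[] values) {
        int sum = 0;
        int i = 0;
        while (!(i >= values.length)) {
            sum = sum + values[i];
            i = i + 1;
        }
        return sum;
    }

    // Original: two separate statements.
    static int differenceSplit(int value, int z) {
        int x = value;
        int y = x - z;
        return y;
    }

    // Rewrite: statements merged and an unnecessary cast inserted.
    static int differenceMerged(int value, int z) {
        return (int) (value - z);
    }

    public static void main(String[] args) {
        int[] data = {3, 1, 2};
        System.out.println(sumFor(data) == sumWhile(data));
        System.out.println(differenceSplit(7, 2) == differenceMerged(7, 2));
    }
}
```

Both variants of each pair compute the same result, which is what makes such rewrites useful as test plagiarisms: the behavior is unchanged while the code that the detection has to process differs.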

More detailed information about the create as well as about the subject to the issue can be found in the issue [Develop an end-to-end testing strategy](https://github.com/jplag/JPlag/issues/193 "Develop an end-to-end testing strategy").
More detailed information about the creation as well as about the subject of the issue can be found in the issue [Develop an end-to-end testing strategy](https://github.com/jplag/JPlag/issues/193 "Develop an end-to-end testing strategy").

**The changes listed above have been developed and evaluated for purely scientific purposes and are not intended to be used for plagiarism in the public or private domain.**

Software is according to [§ 2 of the copyright law](https://www.gesetze-im-internet.de/urhg/__2.html "§ 2 of the copyright law") a protected work which may not be plagiarized.
Software is according to [§ 2 of the copyright law](https://www.gesetze-im-internet.de/urhg/__2.html "§ 2 of the copyright law") a protected work that may not be plagiarized.

## JPlag - End To End TestSuite Structure
The construction of an end to end test is done with the help of the JPlag api.
The tests are generated dynamically according to the existing test data and allow the creation of endToEnd tests for all supported languages of JPlag without having to make any changes to the code.
## JPlag - End-To-End TestSuite Structure
The construction of an end-to-end test is done with the help of the JPlag API.
The tests are generated dynamically according to the existing test data and allow the creation of end-to-end tests for all supported languages of JPlag without having to make any changes to the code.
The helper loads the existing test data from the designated directory and creates dynamic tests for the individual directories. It is therefore possible to create different test classes for the different languages.
- JPlagTestSuiteHelper:

``` java
```JAVA
public static Map<LanguageOption, Map<String, Path>> getAllLanguageResources()
```

The list of languages created in this way and their associated data are dynamically generated into test cases in the Test Suite.

``` java
```JAVA
Collection<DynamicTest> dynamicOverAllTest()
```

In order to be able to distinguish in which domain of the recognition changes have occurred, fine granular test cases are used. These are composed of the changes already mentioned above. The plagiarism is compared with the original delivery and thus it is possible to detect and test small sections of the recognition.
To distinguish in which area of the detection changes have occurred, fine-grained test cases are used. These are composed of the changes already mentioned above. Each plagiarism is compared with the original submission, which makes it possible to test small sections of the detection in isolation.

The comparative values were discussed and tested. The following results of the JPlag scan are used for the comparison:
1. minimal similarity as `float`
2. maximum similarity as `float`
3. matched token numbe as `int`
3. matched token number as `int`

The comparative values were disscussed and elaborated in the issue [End to end testing - "comparative values"](https://github.com/jplag/JPlag/issues/548 "End to end testing - \"comparative values\"").
The comparative values were discussed and elaborated in the issue [End-to-end testing - "comparative values"](https://github.com/jplag/JPlag/issues/548 "End-to-end testing - \"comparative values\"").

Additionally it is possible to create several options for the test data. More information about the test options can be found at [JPlag - option variants for the endToEnd tests #590](https://github.com/jplag/JPlag/issues/590 "JPlag - option variants for the endToEnd tests #590"). Currently, various settings are supported by the `minimumTokenMatch`. This can be extended as desired in the record class `Options`.
Additionally, it is possible to create several options for the test data. More information about the test options can be found at [JPlag - option variants for the end-to-end tests #590](https://github.com/jplag/JPlag/issues/590 "JPlag - option variants for the end-to-end tests #590"). Currently, various settings are supported by the `minimumTokenMatch`. This can be extended as desired in the record class `Options`.

The current JPlag scans will be compared with the stored ones.
This was done by storing the data in a *.json file which is read at the beginning of each test run.

``` json
[...]
{
"options" : {
"minimum_token_match" : 1
},
"tests" : {
"SortAlgo-SortAlgo5" : {
"minimal_similarity" : 82.14286,
"maximum_similarity" : 82.14286,
"matched_token_number" : 46
},
"SortAlgo-SortAlgo6" : {
"minimal_similarity" : 83.58209,
"maximum_similarity" : 100.0,
"matched_token_number" : 56
},
"SortAlgo-SortAlgo7" : {
"minimal_similarity" : 96.42857,
"maximum_similarity" : 100.0,
"matched_token_number" : 54
},
[...]
### JSON Result Structure

The structure of the JSON file can be traced using the individual record classes, which can be found under `de.jplag.endtoend.model`.
The outer structure of the JSON file is described by the `ResultDescription` record.
The record contains a map of several options and the corresponding results.
The internal structure consists of several `Options` records, each of which contains information about the configuration for the test run.
Thus the results can be kept apart from those of other configurations.
The test results for the specified options are also stored in the object. They consist of `ExpectedResult` records, which contain the results of the detection.

The hierarchy is as follows:

```JSON
[{
"options":{
"minimum_token_match":"int"
},
"tests":{
"languageIdentifier":{
"minimal_similarity":"float",
"maximum_similarity":"float",
"matched_token_number":"int"
},
"/..."
}
},
{
"options":{
"minimum_token_match":"int"
},
"tests":{
"languageIdentifier":{
"minimal_similarity":"float",
"maximum_similarity":"float",
"matched_token_number":"int"
},
"/..."
}
}]
```
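
As a rough illustration of how this hierarchy maps onto records, the sketch below models the three levels in plain Java. It is a simplification under stated assumptions: the Jackson `@JsonProperty` annotations are omitted, the field layout of `ResultDescription` is assumed from the JSON shape above, and the `StoredResults.find` helper is hypothetical. The sample values are the `SortAlgo-SortAlgo5` entry for `minimum_token_match` 1.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

// Simplified sketch: the real records live in de.jplag.endtoend.model and carry
// Jackson @JsonProperty annotations, which are omitted here.
record Options(Integer minimumTokenMatch) {}

record ExpectedResult(float resultSimilarityMinimum, float resultSimilarityMaximum, int resultMatchedTokenNumber) {}

// Assumed field layout: one entry per "options"/"tests" pair in the JSON array.
record ResultDescription(Options options, Map<String, ExpectedResult> tests) {}

final class StoredResults {

    // Hypothetical helper: looks up the stored result of a test for a given configuration.
    static Optional<ExpectedResult> find(List<ResultDescription> stored, Options options, String testName) {
        return stored.stream()
                .filter(description -> description.options().equals(options))
                .map(description -> description.tests().get(testName))
                .filter(Objects::nonNull)
                .findFirst();
    }

    public static void main(String[] args) {
        List<ResultDescription> stored = List.of(new ResultDescription(
                new Options(1),
                Map.of("SortAlgo-SortAlgo5", new ExpectedResult(82.14286f, 82.14286f, 46))));
        System.out.println(find(stored, new Options(1), "SortAlgo-SortAlgo5"));
    }
}
```

Because records compare by value, two `Options` instances with the same `minimumTokenMatch` are equal, so results stored for one configuration are never confused with those of another.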

---

## Create New Language End To End Tests
## Create New Language End-To-End Tests

This section explains how to create new end to end tests in the existing test suite.
This section explains how to create new end-to-end tests in the existing test suite.
### Creating The Plagiarism
Before you add a new language to the end to end tests i would like to point out that the quality of the tests depends dreadfully on the plagiarism techniques you choose wicht were explaint in sechtion [Steps Towards Plagiarism](#steps-towards-plagiarism).
Before you add a new language to the end-to-end tests, note that the quality of the tests depends heavily on the plagiarism techniques you choose, which are explained in the section [Steps Towards Plagiarism](#steps-towards-plagiarism).
If you need more information about the creation of plans for this purpose, you can also read the elaborations that can be found under [References](#references).
The more and varied changes you apply, the more accurate the end-to-end tests for the language will be.
The more varied changes you apply, the more accurate the end-to-end tests for the language will be.

The following example is taken from the Java end-to-end tests.

**Changing control structures for(…) to while(…):**

``` java
```JAVA
//base class
public class SortAlgo {
//[...]
//...
public void BubbleSortWithoutRecursion(Integer arr[]) {
for(int i = arr.length; i > 1 ; i--) {
for(int innerCounter = 0; innerCounter < arr.length-1; innerCounter++)
@@ -128,14 +145,14 @@ public void BubbleSortWithoutRecursion(Integer arr[]) {
}
}
}
//[...]
//...
}
```

``` java
```JAVA
//created plagiarism
public class SortAlgo5{
//[...]
//...
public void BubbleSortWithoutRecursion(Integer arr[]) {
int i = arr.length;
while(i > 1)
@@ -148,27 +165,78 @@ public void BubbleSortWithoutRecursion(Integer arr[]) {
}
innerCounter++;
}
i--;
}
}
//[...]
//...
}
```
### Copying Plagiarism To The Resources

The plagiarisms created in [Creating The Plagiarism](#creating-the-plagiarism) must now be copied to the corresponding resources folder. It is important not to mix the languages of the plagiarisms or to copy the data into the wrong resource paths.

- At the path `JPlag\jplag.endToEndTesting\src\test\resources\languageTestFiles` a new folder for the language should be created if it does not already exist. For example `[...]\resources\languageTestFiles\JAVA`. If you have plagiarized several different code samples, you can also create additional subfolders under the newly created folder for example `[...]\resources\languageTestFiles\JAVA\sortAlgo`.
- At the path `JPlag/jplag.endToEndTesting/src/test/resources/languageTestFiles` a new folder for the language should be created if it does not already exist. For example `[...]/resources/languageTestFiles/JAVA`. If you have plagiarized several different code samples, you can also create additional subfolders under the newly created folder for example `[...]/resources/languageTestFiles/JAVA/sortAlgo`.

It is important to note that the resource folder name must be the same as the language identifier name in JPlag/Language.
Otherwise, the language option cannot be parsed correctly to the enum type. The identifier can be found in every language module in `Language.java` as `IDENTIFIER`.

Once the tests have been run for the first time, the information for the tests is stored in the folder `../target/testing-directory-submission/LANGUAGE`. This data can be copied to the path `[...]/resources/results/LANGUAGE`. Each subdirectory gets its result JSON file as `[...]/resources/results/LANGUAGE/TEST_SUITE_NAME.json`. Once the test data has been copied, the end-to-end tests can be successfully tested. As soon as a change in the detection takes place, the results will differ from the stored results and the tests will fail if the results have changed.

### Extending The Comparison Value

As already described, the current comparisons in the end-to-end test treat the values of `minimal similarity`, `maximum similarity`, and `matched token number`.
As soon as there is a need to extend these comparison values, this section describes how this can be achieved.
Beforehand, however, this should be discussed in a new issue about this need.

It is important to note that the resource folder name must be exactly the same as the language identifier name in JPlag/Language. Otherwise the language option cannot be parsed correctly to the enum-type.
- c++ with "cpp"
- c# with "csharp"
- GO with "go"
- Java with "java"
- Kotlin with "kotlin"
- Python3 with "python3"
- R with "rlang"
- Rust with "rust"
- Scala with "scala"
- For new comparison values these properties must be extended in the `ExpectedResult` record at the package `de.jplag.endtoend.model`. Here it is sufficient to add the values in the record and to enter the JSON name as `@JsonProperty("json_name")`.

```JAVA
public record ExpectedResult(
@JsonProperty("minimal_similarity") float resultSimilarityMinimum,
@JsonProperty("maximum_similarity") float resultSimilarityMaximum,
@JsonProperty("matched_token_number") int resultMatchedTokenNumber) {
}
```

- To be able to include the new value in the tests, they must be added to the `EndToEndSuiteTest` as a comparison operation at the package `de.jplag.endtoend`. The `runJPlagTestSuite()` function provided for this purpose must be extended to include the new comparison value. To do this, create the comparison as shown in the code example below.

```JAVA
//...
if (Float.compare(result.resultSimilarityMaximum(), jPlagComparison.maximalSimilarity()) != 0) {
addToValidationErrors("maximalSimilarity", String.valueOf(result.resultSimilarityMaximum()),
String.valueOf(jPlagComparison.maximalSimilarity()));
}
//...
```
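
The comparison pattern above can be sketched in a self-contained form as follows. This is only an illustration: `ComparisonSketch`, the `Expected` and `Measured` records, and the error message format are hypothetical stand-ins, not the actual `EndToEndSuiteTest` code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the comparison logic; not the real EndToEndSuiteTest.
final class ComparisonSketch {

    // Stand-ins for the stored expectation and the result of a fresh JPlag run.
    record Expected(float minimalSimilarity, float maximalSimilarity, int matchedTokenNumber) {}

    record Measured(float minimalSimilarity, float maximalSimilarity, int matchedTokenNumber) {}

    // Returns one message per comparison value that deviates from the stored one.
    static List<String> validate(Expected expected, Measured measured) {
        List<String> errors = new ArrayList<>();
        if (Float.compare(expected.minimalSimilarity(), measured.minimalSimilarity()) != 0) {
            errors.add(describe("minimalSimilarity", expected.minimalSimilarity(), measured.minimalSimilarity()));
        }
        if (Float.compare(expected.maximalSimilarity(), measured.maximalSimilarity()) != 0) {
            errors.add(describe("maximalSimilarity", expected.maximalSimilarity(), measured.maximalSimilarity()));
        }
        // A newly added comparison value would get its own check of the same form.
        if (expected.matchedTokenNumber() != measured.matchedTokenNumber()) {
            errors.add(describe("matchedTokenNumber", expected.matchedTokenNumber(), measured.matchedTokenNumber()));
        }
        return errors;
    }

    private static String describe(String name, Object expected, Object actual) {
        return name + ": expected " + expected + " but was " + actual;
    }

    public static void main(String[] args) {
        Expected expected = new Expected(82.14286f, 100.0f, 46);
        System.out.println(validate(expected, new Measured(82.14286f, 100.0f, 47)));
    }
}
```

Collecting all deviations before failing, rather than stopping at the first one, shows in a single run every comparison value that a change in the detection has affected.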

- Once the tests run the first time they will fail due to the missing values in the old JSON result file used for the test cases. The old results must then be replaced with new ones.
For this purpose, the last section of the chapter [Copying Plagiarism To The Resources](#copying-plagiarism-to-the-resources) can be used as help.

### Extending JPlag Test Run Options
The end-to-end tests support the scan options of the JPlag API. Currently `minimumTokenMatch` is used in the end-to-end tests. These values are also stored in the JSON as configuration to keep the test cases for the different options apart. This also makes it possible to detect changes in the logic of the individual options.

- To add new options to the end-to-end tests, they have to be added to the record `Options` in the package `de.jplag.endtoend.model`. Here it is sufficient to add the values in the record and to enter the JSON name as `@JsonProperty("json_name")`.

```JAVA
public record Options(
@JsonProperty("minimum_token_match") Integer minimumTokenMatch) {
}
```

- After the new value has been added to the record, the creation of the object must also be adjusted in the `EndToEndSuiteTest`. The `setRunOptions` function is provided for this purpose. The options can be added in any order and combination. It should be noted that each test case is run with these options.

```JAVA
private void setRunOptions() {
options = new ArrayList<>();
options.add(new Options(1));
options.add(new Options(15));
}
```
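
Since each test case is run with every configured option, the number of runs is the product of both lists. The sketch below illustrates this with plain Java; `OptionMatrixSketch`, `plannedRuns`, and the naming scheme of the runs are hypothetical and only meant to show the combination, not how the suite names its dynamic tests.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of combining run options with test cases.
final class OptionMatrixSketch {

    // Simplified stand-in for the Options record (Jackson annotation omitted).
    record Options(Integer minimumTokenMatch) {}

    // Produces one planned run per (option, test case) pair.
    static List<String> plannedRuns(List<Options> options, List<String> testCases) {
        List<String> runs = new ArrayList<>();
        for (Options option : options) {
            for (String testCase : testCases) {
                runs.add(testCase + " [minimum_token_match=" + option.minimumTokenMatch() + "]");
            }
        }
        return runs;
    }

    public static void main(String[] args) {
        List<Options> options = List.of(new Options(1), new Options(15));
        List<String> testCases = List.of("SortAlgo-SortAlgo5", "SortAlgo-SortAlgo6");
        plannedRuns(options, testCases).forEach(System.out::println);
    }
}
```

Every option added in `setRunOptions` therefore multiplies the number of executed comparisons, and each combination needs its own stored result.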

- If you want to create individual test cases that test the options only on a specific set of data, a new test case must be created for this purpose. For the new test cases, the options can be adjusted and passed as parameters. This can then be tested with the function `runTests`.
```JAVA
runTests(directoryName, option, currentLanguageIdentifier, testCase, currentResultDescription);
```

Once the tests have been run for the first time, the information for the tests is stored in the folder `..\target\testing-directory-submission\LANGUAGE`. This data can be copied to the path `[...]\resources\results\LANGUAGE`. Each subdirectory gets its own result json file as `[...]\resources\results\JAVA\sortAlgo.json`. Once the test data has been copied, the endToEnd tests can be successfully tested. As soon as a change in the detection takes place, the results will differ from the stored results and the tests will fail if the results have changed.
- Once the tests run the first time they will fail due to the missing values in the old JSON result file used for the test cases. The old results must then be replaced with new ones.
For this purpose, the last section of the chapter [Copying Plagiarism To The Resources](#copying-plagiarism-to-the-resources) can be used as help.
Original file line number Diff line number Diff line change
@@ -63,8 +63,8 @@ public EndToEndSuiteTest() throws IOException {
*/
private void setRunOptions() {
options = new ArrayList<>();
options.add(new Options(1));
options.add(new Options(15));
options.add(new Options(9));
options.add(new Options(3));
}

/**
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
public class SortAlgo2 {
public class SortAlgo4 {
private int firstCounter;
private int arrayLenght;
private int swapVarI;
Original file line number Diff line number Diff line change
@@ -8,18 +8,18 @@ public void BubbleSortRecursion(Integer arr[], int n) {
return;

for (int i = 0; i < n - 1; i++) {
if (arr[i] > arr[i + 1]) {
swap(arr, i, i + 1);
if (arr[i] > arr[add(i , 1)]) {
swap(arr, i, add(i , 1));
}
}
BubbleSortRecursion(arr, n - 1);
}

public void BubbleSortWithoutRecursion(Integer arr[]) {
for (int i = arr.length; i > 1; i--) {
for (int innerCounter = 0; innerCounter < arr.length - 1; innerCounter++) {
if (arr[innerCounter] > arr[innerCounter + 1]) {
swap(arr, innerCounter, (innerCounter + 1));
for (int innerCounter = 0; innerCounter < subtract(arr.length, 1); innerCounter++) {
if (arr[innerCounter] > arr[add(innerCounter , 1)]) {
swap(arr, innerCounter, add(innerCounter , 1));
}
}
}
@@ -30,4 +30,14 @@ private final <T> void swap(T[] arr, int i, int j) {
arr[i] = arr[j];
arr[j] = t;
}

private int add(int value1, int value2)
{
return value1 + value2;
}

private int subtract(int value1, int value2)
{
return value1 - value2;
}
}