merged_java.csv contains an extended version of the Java dataset.
The manually classified Java comments from "Classifying code comments in Java open-source software systems" have been merged with the NLBSE'23 Tool Competition Java dataset.
Duplicates were removed conservatively during the merge: the two sources overlap in only 10% of their files, and those overlapping files were removed from the newly added data. This approach also leaves the original NLBSE'23 Tool Competition Java dataset untouched.
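
The file-level deduplication described above can be sketched as follows. This is a minimal illustration, not the script used to build merged_java.csv: the function name `merge_without_overlap`, the `file` key, and the sample rows are all hypothetical, and the real column names in the datasets may differ.

```python
def merge_without_overlap(base_rows, extra_rows, file_key="file"):
    """Append extra_rows to base_rows, skipping every extra row whose
    file already appears in base_rows. base_rows is never modified.

    `file_key` is an assumed column name, not taken from the real CSVs.
    """
    # Files already covered by the base (competition) dataset.
    base_files = {row[file_key] for row in base_rows}
    # Conservative rule: drop a new row if its file overlaps at all,
    # even if the individual comment might not be a duplicate.
    kept = [row for row in extra_rows if row[file_key] not in base_files]
    return list(base_rows) + kept


# Hypothetical sample data standing in for the two sources.
base = [
    {"file": "A.java", "comment": "// returns the id"},
    {"file": "B.java", "comment": "// TODO: refactor"},
]
extra = [
    {"file": "B.java", "comment": "/* license header */"},
    {"file": "C.java", "comment": "// see Foo#bar"},
]

merged = merge_without_overlap(base, extra)
```

Filtering on whole files rather than on individual comments discards some non-duplicate rows, but it guarantees that no file contributes comments from both sources, and the base rows come through unchanged.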