An example dataset of Breiman's variable importance scores
A dataset containing the results of 1,000 calculations of Breiman's variable importance scores for software metrics.
data
Format
A data frame with 1,000 rows and 27 variables:
- Avg_CloneLineCount: Average number of physical lines in the clone siblings of a clone.
- Avg_CountLineComment: Average number of comment lines in the methods that contain the clone siblings of a clone.
- Avg_Cyclomatic: McCabe Cyclomatic complexity of the method that contains the clone.
- Avg_ImproveCommitCount: Number of commits that impact the method containing the clone.
- Avg_LineAdded: Number of lines added into the method that contains the clone.
- Avg_LineCodeCount: Number of source code lines in the method that contains the clone.
- Avg_MaxNesting: Maximum nesting level of control constructs in the method that contains the clone.
- Avg_NewFeatureCommitCount: Number of commits that introduce a new feature and that impact the method containing the clone.
- Avg_RatioCommentToCode: Ratio of CommentLineCount to LineCodeCount.
- Avg_RatioLineCodeCount: Ratio of LineCount to CloneLineCount.
- Avg_TokenCount: Number of tokens in the clone.
- CloneType: Type of clone class to which the clone belongs.
- Diff_CloneLineCount: Number of physical lines in the clone.
- Diff_CountLineComment: Number of comment lines in the method that contains the clone.
- Diff_Cyclomatic: McCabe Cyclomatic complexity of the method that contains the clone.
- Diff_DeveloperCount: Number of distinct developers who modified the method that contains the clone.
- Diff_Essential: Numerical measure of structuredness of the method that contains the clone.
- Diff_FanIn: Number of unique methods that call the method containing the clone.
- Diff_FanOut: Number of unique methods that are called by the method containing the clone.
- Diff_FixCommitCount: Number of commits with a description of fixing bugs and that impact the method containing the clone.
- Diff_LineCodeDeclCount: Number of declarative source code lines in the method that contains the clone.
- Diff_LineCount: Number of lines in the method that contains the clone.
- Diff_LineDeleted: Number of lines deleted from the method that contains the clone.
- Diff_NewFeatureCommitCount: Number of commits that introduce a new feature and that impact the method containing the clone.
- Diff_TokenCount: Number of tokens in the clone.
- Max_DirectoryDistance: Number of directories that are traversed from the method containing one sibling to the method containing another sibling of the clone.
- SiblingCount: Number of clone siblings in the clone.
Source
https://github.com/klainfo/ScottKnottESD/
maven