INDEX
Explanations
grades and evaluations related to performance metrics
Head Attr Weights
0:0.02
1:0.02
2:0.06
3:0.05
4:0.08
5:0.01
6:0.06
7:0.37
8:0.02
9:0.03
10:0.18
11:0.06
Negative Logits
roots: -1.63
vag: -1.57
hots: -1.55
forth: -1.54
メ: -1.52
fold: -1.48
requ: -1.46
ibus: -1.45
sharing: -1.45
mone: -1.45
Positive Logits
grades: 2.09
Grade: 1.98
evaluations: 1.85
evaluation: 1.81
evaluating: 1.75
erate: 1.73
Ratings: 1.72
Classification: 1.72
outlook: 1.68
graded: 1.68
Activations Density 0.008%