INDEX
Explanations
expressions related to evaluation and performance metrics
Head Attr Weights
0:0.03
1:0.02
2:0.38
3:0.06
4:0.10
5:0.03
6:0.06
7:0.07
8:0.05
9:0.03
10:0.05
11:0.06
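As a rough illustration of how the head attribution weights above might be read, the sketch below copies the twelve listed values into a Python dict and picks out the head with the largest weight. The variable names are hypothetical, and treating the weights as comparable per-head shares is an assumption rather than something the page states.

```python
# Head attribution weights copied from the list above (head index -> weight).
head_attr_weights = {
    0: 0.03, 1: 0.02, 2: 0.38, 3: 0.06, 4: 0.10, 5: 0.03,
    6: 0.06, 7: 0.07, 8: 0.05, 9: 0.03, 10: 0.05, 11: 0.06,
}

# Head with the largest listed weight.
top_head = max(head_attr_weights, key=head_attr_weights.get)

# Sum of the listed weights, and the heads ranked from largest to smallest.
total = sum(head_attr_weights.values())
ranked = sorted(head_attr_weights.items(), key=lambda kv: kv[1], reverse=True)

print(f"dominant head: {top_head} (weight {head_attr_weights[top_head]:.2f})")
print(f"sum of listed weights: {total:.2f}")
print("top three heads:", ranked[:3])
```

Head 2 carries the largest listed weight (0.38), well ahead of head 4 (0.10). The listed values sum to 0.94 rather than 1, so they should not be assumed to form a normalized distribution.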
Negative Logits
stip         -1.81
deduction    -1.58
issions      -1.56
administ     -1.55
reinstated   -1.52
rebate       -1.48
exchange     -1.43
rule         -1.40
sake         -1.39
Regist       -1.39
Positive Logits
TextColor    1.68
jew          1.68
lite         1.66
achy         1.59
tan          1.47
blers        1.46
iren         1.45
Pie          1.44
bones        1.44
Haunted      1.43
Activations Density 0.109%