Explanations
subjective opinions and predictions about future events
Negative Logits
.CommandType  -0.15
elay          -0.15
UIL           -0.15
arna          -0.15
IMS           -0.15
Pear          -0.14
angers        -0.14
imiter        -0.14
innocent      -0.14
vil           -0.14
Positive Logits
predicting    0.18
pencil        0.17
handic        0.17
penc          0.17
predicts      0.16
otch          0.16
award         0.16
expects       0.16
progn         0.16
peg           0.15
Activations Density: 0.068%