INDEX
Explanations
personal opinions and beliefs expressed with emotive language
Negative Logits
littered    -0.60
Actions     -0.59
CI          -0.58
senal       -0.57
bount       -0.57
assault     -0.57
CI          -0.56
Rounds      -0.56
Flags       -0.55
rant        -0.55
Positive Logits
afford      1.40
muster      1.08
feas        1.05
conceive    0.97
athom       0.96
cope        0.95
withstand   0.95
decipher    0.94
imagine     0.94
stomach     0.93
Activations Density 1.732%