INDEX
Explanations
verbs or phrases related to making decisions or recommendations
statements that indicate stated opinions or beliefs
Negative Logits
Token        Logit
isal         -0.75
cation       -0.70
mania        -0.62
asus         -0.60
panic        -0.60
ommel        -0.59
aml          -0.59
lance        -0.59
tery         -0.59
etheless     -0.58
Positive Logits
Token        Logit
themselves   1.19
selves       0.97
selves       0.84
helmets      0.68
jointly      0.68
individually 0.66
uniforms     0.65
mouths       0.64
microphones  0.64
aughtered    0.64
Activations Density: 0.980%