Negative Logits
detriment   -0.11
azar        -0.10
imi         -0.10
roz         -0.09
iba         -0.09
sovere      -0.09
CONSTANTS   -0.09
eas         -0.09
ubar        -0.09
staples     -0.09
Positive Logits
pointer     0.09
major       0.09
advoc       0.09
termin      0.09
rhetoric    0.09
ilar        0.09
folds       0.09
enlisted    0.09
prime       0.08
barn        0.08
Activations Density: 0.153%