INDEX
Explanations
phrases indicating knowledge or information
phrases emphasizing knowledge or awareness
Negative Logits
ItemTracker      -0.79
oples            -0.70
phrine           -0.67
isco             -0.67
voucher          -0.66
issions          -0.65
pex              -0.64
privatization    -0.64
Gators           -0.63
otion            -0.63
Positive Logits
lege             1.25
ledge            1.17
ledged           1.05
LED              0.82
LES              0.76
SOURCE           0.76
cut              0.74
beforehand       0.74
ABOUT            0.74
nothing          0.73
Activations Density 0.060%