INDEX
Explanations
proper nouns related to political figures, locations, and events
the prefix "un" indicating negation or reversal
Negative Logits
Ö¼          -0.68
accur       -0.67
ãĤ¼         -0.66
Madden      -0.65
sophistic   -0.63
defe        -0.63
Peb         -0.62
abiding     -0.62
ortment     -0.61
roomm       -0.59
Positive Logits
geon        1.31
nel         1.21
geons       1.18
culus       1.12
ned         1.09
idad        1.05
ners        1.03
nels        1.01
iversity    0.98
culosis     0.97
Activations Density 0.036%