INDEX
Explanations
proper nouns and locations
variations of the suffix '-ev' or similar endings in words
Negative Logits
ngth        -0.69
TRY         -0.63
Heard       -0.62
Pwr         -0.61
Amnesty     -0.60
Danger      -0.57
MacArthur   -0.57
catentry    -0.56
words       -0.56
ports       -0.55
Positive Logits
irus        1.19
irtual      1.08
ille        1.01
ideos       0.99
iral        0.99
ideo        0.99
enture      0.97
ski         0.96
oice        0.94
incial      0.94
Activations Density 0.070%