INDEX
Explanations
names of geographical locations
proper nouns and names related to individuals and organizations
Negative Logits
Token       Logit
ulhu        -0.74
Archdemon   -0.71
arcity      -0.61
roach       -0.61
behind      -0.60
MAC         -0.60
healthy     -0.59
OPLE        -0.59
ournals     -0.58
rers        -0.58
Positive Logits
Token       Logit
eton        0.81
etric       0.75
achev       0.73
ibrary      0.70
kov         0.70
tes         0.69
ר           0.68
oxide       0.67
pg          0.65
beck        0.65
Activations Density: 0.249%