INDEX
Explanations
proper names or nouns associated with individuals
Negative Logits
Token        Logit
afia         -0.19
ob           -0.16
AMP          -0.16
agn          -0.15
zs           -0.15
zu           -0.15
departure    -0.15
din          -0.15
depart       -0.15
.Lib         -0.14
Positive Logits
Token        Logit
anlay        0.16
Kro          0.16
stash        0.15
IRO          0.15
ursions      0.15
etur         0.14
ito          0.14
åĤ           0.14
ippet        0.14
iec          0.14
Activations Density 0.150%