Explanations
names of people and places
Negative Logits
kud       -0.17
/place    -0.16
rus       -0.16
assis     -0.16
wives     -0.16
warts     -0.15
akin      -0.15
oxel      -0.15
sáng      -0.15
ios       -0.15
Positive Logits
mallow    0.21
pike      0.18
borough   0.18
juana     0.17
thon      0.17
chal      0.16
bles      0.16
inel      0.16
Bold      0.16
andise    0.15
Activations Density 0.029%