INDEX
Explanations
the word "Kon" in various contexts
Negative Logits
ece     -0.20
ylon    -0.17
quo     -0.17
eous    -0.17
yth     -0.17
e       -0.16
eut     -0.16
ei      -0.16
eing    -0.16
eah     -0.16
Positive Logits
stant   0.29
rad     0.25
igs     0.24
stan    0.21
kre     0.21
cert    0.20
trap    0.20
راد     0.19
σταν    0.19
f       0.19
Activations Density: 0.009%