Explanations
numeric values and quantities
Negative Logits
lama     -0.16
oyer     -0.16
.SC      -0.15
AIM      -0.15
kol      -0.14
plug     -0.14
ached    -0.14
бу       -0.14
ardon    -0.14
elp      -0.13
Positive Logits
atas     0.16
anth     0.16
ané      0.15
rens     0.14
Verb     0.14
aklı     0.14
.jboss   0.14
axis     0.13
ish      0.13
aus      0.13
Activations Density 0.172%