Explanations
terms related to boundaries and interfaces between concepts or categories
Negative Logits
Token          Logit
apl            -0.17
/*@            -0.17
poo            -0.16
entai          -0.15
necessarily    -0.15
echa           -0.14
lek            -0.14
correspond     -0.14
uw             -0.14
IZE            -0.14
Positive Logits
Token             Logit
ething            0.21
between           0.20
giữa              0.17
.updateDynamic    0.17
Between           0.17
voke              0.15
между             0.15
variant           0.15
between           0.15
аниц              0.15
Activations Density 0.208%
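
The density figure can be read as the share of sampled tokens on which this feature fires. The page does not define it, so the sketch below rests on an assumption: density is the fraction of tokens whose activation is strictly above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only to reproduce the displayed percentage.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature is active. The dashboard itself does not state the definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 tokens, of which 208 activate the feature.
sample = [0.0] * 99_792 + [1.3] * 208
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.208%
```

Under this reading, a density of 0.208% means the feature is active on roughly 2 out of every 1,000 tokens, which is typical of a sparse feature.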