Explanations
specific terms related to biological or physical structures and processes
Negative Logits
Anch       -0.15
attery     -0.15
Demir      -0.15
ivia       -0.15
مهد        -0.14
Anchor     -0.14
gebn       -0.14
orning     -0.14
ibilit     -0.14
таблет     -0.14
Positive Logits
ua         0.17
iry        0.15
ate        0.15
rut        0.15
iro        0.15
uco        0.15
inka       0.14
idl        0.14
uang       0.14
Lang       0.14
Activations Density 0.017%