INDEX
Explanations
instances where a concept or entity is revealed or highlighted
Negative Logits
íĢ       -0.15
ديث      -0.14
thumb    -0.14
та       -0.13
angan    -0.13
ylland   -0.13
Kub      -0.13
stit     -0.13
вет      -0.13
Defined  -0.13
Positive Logits
ño       0.16
and      0.15
σκε      0.14
eson     0.13
Moran    0.13
iev      0.13
rello    0.13
nomin    0.13
бот      0.13
igar     0.13
Activations Density 0.019%