INDEX
Explanations
phrases indicating composition or structure
Negative Logits
سل       -0.16
Insecta  -0.16
lea      -0.15
ede      -0.14
Dise     -0.14
bsp      -0.14
gel      -0.14
stell    -0.14
Surre    -0.14
cko      -0.14
Positive Logits
errat      0.16
ilda       0.16
ób         0.15
izr        0.15
hurst      0.15
ledon      0.14
normalize  0.14
onis       0.14
orer       0.14
alach      0.14
Activations Density 0.037%