INDEX
Explanations
conjunctions and phrases that express exceptions or comparisons
Negative Logits
auen      -0.16
ocre      -0.15
ieber     -0.15
Powder    -0.14
794       -0.14
γι        -0.14
xae       -0.14
missing   -0.14
inesis    -0.14
ebi       -0.14
Positive Logits
ikon      0.18
elden     0.17
ãi        0.16
á»ĭ       0.16
ãĤµãĤ¤    0.15
ë²Ī       0.14
iniz      0.14
ogl       0.14
urdu      0.14
asl       0.14
Activations Density: 0.286%