INDEX
Explanations
references to alternative options or subjects in various contexts
Negative Logits
另一        -0.16
other       -0.16
otherwise   -0.16
uel         -0.16
autre       -0.16
autres      -0.15
lain        -0.15
amen        -0.15
Other       -0.15
anner       -0.15
Positive Logits
-than       0.35
than        0.33
niż         0.29
world       0.28
ewise       0.28
wis         0.28
than        0.27
equally     0.25
_than       0.24
similarly   0.24
Activations Density 0.110%