Explanations
phrases indicating relational or directional contexts
Negative Logits
latter       -0.22
/INFO        -0.16
uae          -0.15
AppBundle    -0.15
ksam         -0.14
جÙĪ          -0.14
klä          -0.14
828          -0.14
832          -0.13
ège          -0.13
Positive Logits
gether       0.32
atre         0.24
ersen        0.20
ir           0.20
clusive      0.19
xic          0.19
oret         0.18
ilet         0.18
oner         0.18
asted        0.17
Activations Density 0.050%