INDEX
Explanations
phrases indicating a purpose or intention
Negative Logits
º         -0.15
opis      -0.14
ORMAT     -0.14
λεί       -0.14
sembl     -0.14
igu       -0.13
dad       -0.13
erten     -0.13
970       -0.13
hop       -0.13
Positive Logits
ges       0.20
purposes  0.18
unto      0.16
amina     0.16
uyla      0.16
INED      0.15
sake      0.15
oble      0.15
ged       0.15
reasons   0.15
Activations Density: 0.259%