INDEX
Explanations
phrases that indicate choice or alternatives
NEGATIVE LOGITS
Either    -0.21
Either    -0.19
either    -0.19
både      -0.18
either    -0.17
enheim    -0.17
nejen     -0.16
aday      -0.15
anca      -0.15
nawet     -0.15
POSITIVE LOGITS
way       0.31
/or       0.30
-way      0.28
side      0.28
way       0.26
/all      0.23
-than     0.23
party     0.23
-or       0.23
directly  0.22
Activations Density 0.025%
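
As a rough illustration of the density figure, the sketch below assumes that "activations density" means the fraction of token positions on which the feature fires (activation above zero). That reading is an assumption, not something the dashboard states. The function name `activation_density` and the sample data are hypothetical.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Fraction of positions whose activation exceeds `threshold`.

    Assumption: density = (number of active tokens) / (total tokens).
    """
    total = len(activations)
    if total == 0:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / total


# Hypothetical data: the feature fires on 1 of 4000 tokens.
acts = [0.0] * 4000
acts[123] = 1.7

density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.025%
```

Under this reading, a density of 0.025% corresponds to the feature being active on roughly one token in every 4000.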