INDEX
Explanations
conditional phrases and expressions of approval or disapproval
Negative Logits
やって    -0.15
Všech    -0.15
мів    -0.15
yš    -0.15
ambre    -0.15
ylko    -0.14
ходим    -0.14
금    -0.14
élé    -0.14
_TAC    -0.13
Positive Logits
inox    0.18
enheim    0.17
Podesta    0.16
bris    0.15
McK    0.15
tob    0.15
zb    0.14
illard    0.14
coal    0.14
quiz    0.14
Activations Density 0.082%
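
The density figure above is usually read as the share of token positions on which the feature fires. The sketch below shows one way to compute such a figure. It assumes "fires" means an activation strictly above a threshold of 0; the function name `activation_density` and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: a position counts as active when its activation is strictly
    greater than `threshold` (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 8 of which are active.
sample = [0.0] * 10_000
for i in range(8):
    sample[i * 1000] = 1.5

density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.080%
```

A density this low means the feature is sparse: it is silent on nearly all tokens and fires only on a small, specific subset.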