Explanations
phrases and expressions that convey feelings and opinions about quality, improvement, or disappointment
Negative Logits
atik      -0.18
ouch      -0.18
nal       -0.16
ubar      -0.16
isko      -0.15
Bias      -0.15
bak       -0.15
ceed      -0.14
urance    -0.14
zych      -0.14
Positive Logits
better    0.88
better    0.76
Better    0.75
Better    0.70
BET       0.66
bet       0.63
mejor     0.63
лучше     0.60
besser    0.58
melhor    0.58
Activations Density 0.251%
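
The density figure above is read here as the fraction of token positions on which the feature is active. The sketch below illustrates that reading on synthetic data. The function name activation_density, the zero threshold, and the sample values are assumptions made for illustration, not something the dashboard specifies.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Hypothetical helper: the zero threshold is an assumption, not the
    dashboard's documented definition.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic example: 251 active positions out of 100,000 sampled tokens.
sample = [1.5] * 251 + [0.0] * 99_749
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.251% means the feature fires on roughly one token in 400, which fits a feature tied to a narrow family of "better"-type tokens.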