INDEX
Explanations
phrases questioning reasons or causes
questions and phrases emphasizing the concept of "why" or reasons behind actions and situations
Negative Logits
aughed      -0.86
iece        -0.84
lator       -0.80
ibaba       -0.79
iHUD        -0.70
yi          -0.68
ymph        -0.68
utenberg    -0.67
arya        -0.67
ãĤ´ãĥ³      -0.64
Positive Logits
soever      0.96
people      0.85
they        0.85
bother      0.80
exactly     0.79
someone     0.79
somebody    0.76
anyone      0.73
why         0.71
we          0.70
Activations Density 0.045%