INDEX
Explanations
sentences expressing disappointment over unrealized potential and unmet expectations
Negative Logits
agher          -0.17
-pagination    -0.15
amilia         -0.15
ãĥ¼ãĥģ         -0.15
parison        -0.14
à¥ģब           -0.14
por            -0.14
ibaba          -0.14
chner          -0.14
atrix          -0.14
Positive Logits
727            0.18
226            0.15
661            0.15
fasc           0.15
vala           0.14
@c             0.14
697            0.13
eral           0.13
_lc            0.13
jax            0.13
Activations Density: 0.259%