Explanations
instances of gratitude or appreciation expressed in the text
Negative Logits
Token    Logit
spl      -0.15
луг      -0.14
utas     -0.14
зави     -0.14
ICLES    -0.14
istes    -0.13
ATIC     -0.13
TION     -0.13
isode    -0.13
ottes    -0.13
Positive Logits
Token    Logit
it       0.19
there    0.18
anky     0.17
we       0.16
hence    0.15
this     0.15
they     0.15
it       0.14
Carly    0.14
there    0.14
Activations Density 0.245%