INDEX
Explanations
emotional language related to personal feelings and beliefs
references to emotional expressions and existential themes
Negative Logits
peacefully       -0.61
online           -0.59
arthed           -0.58
NOW              -0.58
appropriately    -0.58
Together         -0.57
together         -0.56
anwhile          -0.56
edIn             -0.56
aughs            -0.54
Positive Logits
toward           0.77
to               0.77
towards          0.75
guiName          0.63
to               0.63
for              0.62
)=(              0.61
âĢ               0.60
whereby          0.58
âĢ               0.58
Activations Density 0.981%