INDEX
Explanations
words related to emotions or sentiments of varying intensity
occurrences of the word "mo," likely looking for references to emotions or mood-related speech
Negative Logits
Interstitial      -0.73
ire               -0.71
Integrity         -0.64
ICLE              -0.62
rating            -0.59
stood             -0.58
AMS               -0.58
caution           -0.58
responsibility    -0.57
AAP               -0.57
Positive Logits
ovies             0.96
ogle              0.94
zilla             0.91
ose               0.89
aned              0.89
orthy             0.87
obil              0.87
edia              0.86
xon               0.84
asters            0.84
Activations Density 0.019%