INDEX
Explanations
instances where someone expresses positive feelings about something
sentiments related to feeling "good" or "bad" about various experiences or situations
Negative Logits
Token                     Logit
DragonMagazine            -0.80
hiba                      -0.79
¯¯¯¯¯¯¯¯                  -0.77
hesis                     -0.75
Write                     -0.73
hai                       -0.72
ulum                      -0.71
SPONSORED                 -0.70
Written                   -0.70
------------------------  -0.70
Positive Logits
Token                     Logit
reforming                 0.80
donating                  0.79
marrying                  0.79
reducing                  0.79
quitting                  0.78
preserving                0.77
committing                0.76
trusting                  0.76
losing                    0.75
how                       0.75
Activations Density: 0.063%