INDEX
Explanations
Instances of emphasis or exaggeration in text
Negative Logits
76561        -0.79
inarily      -0.71
ords         -0.69
issance      -0.69
riots        -0.68
craft        -0.68
enance       -0.63
isode        -0.63
iens         -0.62
legram       -0.62
Positive Logits
far          0.83
much         0.82
busy         0.82
distracting  0.81
risky        0.81
tempting     0.80
simplistic   0.80
afraid       0.78
costly       0.75
cozy         0.74
Activations Density: 0.042%