Explanations
expressions of strong emotional reactions
Negative Logits
Token         Logit
NUMX          -0.75
XNUMX         -0.75
\"");         -0.70
sidemargin    -0.68
kloped        -0.67
]<<           -0.63
awsze         -0.63
".            -0.62
$.            -0.62
styleType     -0.61
Positive Logits
Token         Logit
,             0.75
SpringRunner  0.72
Whoa          0.69
!             0.67
Wow           0.64
Whoa          0.63
Nope          0.60
hey           0.60
yeah          0.58
Hey           0.58
Activations Density 0.177%