INDEX
Explanations
intensely emotional or passionate language
Negative Logits
ambda       -0.16
orum        -0.15
-spacing    -0.15
_EST        -0.14
uteur       -0.14
vez         -0.14
ypse        -0.14
Prompt      -0.14
hea         -0.14
plaintext   -0.14
Positive Logits
ness        0.29
pace        0.27
pursuit     0.24
-paced      0.22
ly          0.21
paced       0.20
NESS        0.19
Pace        0.18
nature      0.18
attack      0.17
Activations Density 0.102%