Explanations
phrases related to suffering or distress
terms related to suffocation or significant constraints
Negative Logits
Token      Logit
Pegasus    -0.82
Drake      -0.72
Pic        -0.71
Kislyak    -0.69
Oak        -0.68
rake       -0.67
tec        -0.65
hran       -0.65
Onion      -0.65
ho         -0.63
Positive Logits
Token      Logit
suff       4.27
Suff       2.64
suff       2.22
drown      1.10
ana        1.00
suffice    1.00
paraly     0.94
murd       0.92
inhal      0.92
stricken   0.91
Activations Density: 0.017%