INDEX
Explanations
phrases related to coercion and pressure
Negative Logits
vironment     -0.89
obyl          -0.82
sterdam       -0.80
mbuds         -0.80
nam           -0.80
namese        -0.79
çĦ            -0.74
ership        -0.74
nect          -0.69
said          -0.68
Positive Logits
exerted       1.07
cooker        1.04
wedge         0.86
compel        0.82
pressuring    0.78
harder        0.77
compulsion    0.77
levers        0.74
pressure      0.74
pedal         0.74
Activations Density 2.286%
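
The page does not say how these numbers were produced. As a rough sketch only, and assuming the usual convention for dashboards of this kind, the logit lists can be read as a feature direction dotted with each token's unembedding vector, and the activation density as the fraction of sampled tokens on which the feature fires. Every name and number below (logit_effects, top_k, activation_density, the toy vectors) is hypothetical and not taken from the source.

```python
def logit_effects(direction, unembed):
    """Dot the feature direction with each token's unembedding vector."""
    return {
        token: sum(d * w for d, w in zip(direction, vector))
        for token, vector in unembed.items()
    }


def top_k(effects, k, positive=True):
    """Return the k tokens with the largest (or most negative) effect."""
    ranked = sorted(effects.items(), key=lambda kv: kv[1], reverse=positive)
    return ranked[:k]


def activation_density(activations, threshold=0.0):
    """Fraction of sampled tokens whose activation exceeds the threshold."""
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Toy data (assumed, for illustration only): a 2-dimensional feature
# direction and three made-up unembedding vectors.
direction = [1.0, -0.5]
unembed = {
    "pressure": [0.8, 0.1],
    "said": [-0.6, 0.2],
    "pedal": [0.6, -0.2],
}

effects = logit_effects(direction, unembed)
print(top_k(effects, 2))                  # most promoted tokens
print(top_k(effects, 1, positive=False))  # most suppressed token
print(activation_density([0.0, 0.0, 0.3, 0.0]))
```

Under this reading, a density of 2.286% would mean the feature is active on roughly 1 in 44 sampled tokens.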