Explanations
instances where something is being compelled or required
instances of coercion or compulsion
Negative Logits
Token        Logit
ership       -0.79
heimer       -0.74
Purpose      -0.70
Excellence   -0.68
ahu          -0.66
nam          -0.64
Blessed      -0.63
ihar         -0.63
yss          -0.63
iT           -0.63
Positive Logits
Token        Logit
forcing      0.76
overtime     0.73
cooker       0.73
force        0.72
forced       0.72
exerted      0.72
ansson       0.70
overpowered  0.69
unbeliev     0.67
coer         0.66
Activations Density: 0.017%