Explanations
phrases related to blaming others or taking responsibility
Negative Logits
marked       -0.95
inational    -0.89
intend       -0.85
mark         -0.82
mental       -0.79
edom         -0.79
spect        -0.79
marks        -0.76
mens         -0.76
perties      -0.76
Positive Logits
blame        0.94
blaming      0.89
scapego      0.83
blames       0.80
blamed       0.76
lash         0.76
oshop        0.76
Gateway      0.75
victim       0.74
culprit      0.73
Activations Density: 9.535%