INDEX
Explanations
phrases related to physical conflicts or confrontations
phrases that indicate conflict or confrontation
Negative Logits
redistributed  -0.77
iture          -0.77
imate          -0.76
icter          -0.75
nosis          -0.75
bably          -0.72
overed         -0.71
SOURCE         -0.71
clips          -0.71
igmatic        -0.71
Positive Logits
regard      0.95
regards     0.85
stood       0.77
fellow      0.75
coworkers   0.73
neighbours  0.73
creditors   0.73
impunity    0.72
mobs        0.70
bandits     0.69
Activations Density 0.123%
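
The density figure above is presumably the share of sampled tokens on which the feature fires. The page does not state the exact definition, so the following is only a minimal sketch under that assumption; the function name `activation_density`, the zero threshold, and the sample values are all hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: density = (tokens with activation > threshold) / (all tokens).
    The dashboard's actual definition may differ.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical per-token activations: one of eight tokens is active.
sample = [0.0, 0.0, 0.0, 2.5, 0.0, 0.0, 0.0, 0.0]
print(f"{activation_density(sample):.3f}%")  # prints 12.500%
```

Under this reading, a density of 0.123% would correspond to roughly one active token in every 800 sampled.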