INDEX
Explanations
phrases advocating for social change and collective action
Negative Logits
Token        Logit
ipa          -0.16
Monaco       -0.15
IRST         -0.15
ingu         -0.14
leigh        -0.14
VIP          -0.14
Americans    -0.14
udden        -0.13
anth         -0.13
Uniform      -0.13
Positive Logits
Token        Logit
workers      0.27
Workers      0.27
struggles    0.27
struggle     0.25
workers      0.24
prolet       0.24
Workers      0.23
worker       0.23
Worker       0.22
Trotsky      0.22
Activations Density 0.212%
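
The page does not define the density figure. A common reading is the share of token positions on which the feature has a nonzero activation, and the sketch below assumes that reading. The function name, the threshold parameter, and the toy activations are all hypothetical and do not come from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all. The dashboard itself does not state its definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: the feature fires on 1 of 8 token positions.
sample = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.0, 0.0]
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.212% would mean the feature is active on roughly 2 of every 1,000 tokens in whatever corpus was sampled.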