Explanations
statements related to representation and political will
Negative Logits
ố           -0.15
ajas        -0.15
æļ          -0.15
engin       -0.14
wildcard    -0.14
arsing      -0.14
adr         -0.14
privilege   -0.14
ffen        -0.14
priv        -0.13
Positive Logits
struggle    0.21
Resistance  0.21
shoulder    0.19
firm        0.19
brothers    0.19
resistance  0.19
struggles   0.19
Resistance  0.18
肩          0.18
strugg      0.18
Activations Density 0.133%