Explanations
instances of support, opposition, and legal terminology related to social and political issues
Negative Logits
esi          -0.18
964          -0.14
edl          -0.14
:frame       -0.14
election     -0.13
Projection   -0.13
ÙħاÙĨÛĮ       -0.13
homophobic   -0.13
tridge       -0.13
vero         -0.13
Positive Logits
/op            0.18
notion         0.17
idea           0.17
htable         0.17
having         0.15
continuation   0.15
fellow         0.14
Idea           0.14
cepts          0.14
зÑĮ            0.14
Activations Density 0.166%