INDEX
Explanations
political and social action phrases
instances of political engagement and community involvement
Negative Logits
;;;;;;;;     -0.62
Moroc        -0.62
bourg        -0.62
never        -0.61
nown         -0.59
mars         -0.58
drowned      -0.58
accordingly  -0.58
00007        -0.57
Neh          -0.57
Positive Logits
irtual       0.65
GROUP        0.61
issan        0.59
esta         0.58
pires        0.57
these        0.57
lict         0.57
cery         0.56
comes        0.55
hindsight    0.55
Activations Density: 0.326%