Explanations
words related to political or governmental themes
Negative Logits
DATA                 -0.71
llan                 -0.65
Factor               -0.59
Debor                -0.59
Citizen              -0.59
natureconservancy    -0.58
FANTASY              -0.56
ãĥ¼ãĥĨãĤ£            -0.56
Glass                -0.55
âĸijâĸij               -0.55
Positive Logits
ipop         1.19
ateral       1.14
phe          1.09
okia         1.06
onial        1.02
oad          0.95
ounge        0.95
ibrary       0.94
ocation      0.93
anguages     0.93
Activations Density 0.011%