Explanations
names of political figures and terms related to political discussions
Negative Logits
amaru           -0.67
tom             -0.63
vertisements    -0.61
nova            -0.60
ezvous          -0.60
Thumbnail       -0.60
venants         -0.59
oshenko         -0.58
nown            -0.58
ril             -0.58
Positive Logits
sake            1.36
purposes        1.25
reasons         1.05
reason          0.70
erning          0.69
icion           0.69
lovers          0.67
alike           0.65
aspiring        0.64
ummies          0.63
Activations Density: 0.194%