INDEX
Explanations
specific terms related to a political context, such as names and policies
references to political figures and related terms
Negative Logits
hran      -0.76
ileaks    -0.71
ortment   -0.71
ingham    -0.71
itle      -0.71
estate    -0.70
ering     -0.69
umi       -0.69
Sakuya    -0.69
achi      -0.68
Positive Logits
cffffcc   0.74
forth     0.73
utive     0.70
olds      0.70
served    0.69
ãĥ«       0.69
oute      0.66
cers      0.66
aded      0.66
Charge    0.65
Activations Density 0.077%