Explanations
mentions of political figures and their roles
Negative Logits
parteci         -0.53
出版年          -0.52
vieve           -0.48
excused         -0.46
expandindo      -0.46
selve           -0.44
Коро            -0.43
participação    -0.43
individuel      -0.43
pierde          -0.42
Positive Logits
policies        0.92
predecessor     0.88
reforms         0.85
revital         0.83
AndEndTag       0.81
oversaw         0.80
Policies        0.80
restructuring   0.77
succeeded       0.76
reform          0.75
Activations Density: 0.279%