Explanations
references to Russian President Vladimir Putin
references to Vladimir Putin
Negative Logits
Token    Logit
dress    -0.83
aver     -0.72
BACK     -0.72
eat      -0.71
alach    -0.70
cular    -0.69
pel      -0.68
dash     -0.67
kick     -0.67
cule     -0.65
Positive Logits
Token     Logit
Putin     1.27
Vladimir  1.15
Vlad      1.05
Jinping   1.04
Dmitry    1.03
Ily       1.02
Nab       0.97
Lenin     0.90
Tayyip    0.87
Dmit      0.85
Activations Density: 0.004%