Explanations
the name "Putin."
mentions of Vladimir Putin
Negative Logits
••••      -0.81
ITNESS    -0.75
Hop       -0.74
Thom      -0.72
WORK      -0.71
ORN       -0.68
IRE       -0.68
AY        -0.68
lihood    -0.67
ergy      -0.67
Positive Logits
Jinping   1.16
achev     1.03
Putin     1.00
Putin     0.98
oleon     0.89
rall      0.87
iets      0.87
cius      0.85
enei      0.84
otti      0.80
Activations Density 0.006%