INDEX
Explanations
the name "Putin" in various contexts
mentions of Vladimir Putin
NEGATIVE LOGITS
Thom          -0.80
••••          -0.77
ibles         -0.75
ITNESS        -0.75
IENT          -0.74
ANE           -0.73
ppelin        -0.70
IRD           -0.69
Honolulu      -0.69
Kear          -0.67
POSITIVE LOGITS
Jinping       1.14
cius          0.95
annexed       0.94
Putin         0.91
otti          0.84
iets          0.81
rall          0.80
dictator      0.77
dictatorship  0.76
sky           0.76
Activations Density: 0.020%