NEGATIVE LOGITS
_In           -0.07
col           -0.07
�             -0.06
014           -0.06
Marie         -0.06
exhibition    -0.06
dining        -0.06
727           -0.06
ior           -0.06
RowBox        -0.06
POSITIVE LOGITS
Russia        0.11
Russ          0.10
Russell       0.10
Russo         0.09
Russian       0.09
Russ          0.08
Russians      0.08
Putin         0.08
Russia        0.08
Russian       0.07
ACTIVATIONS DENSITY: 0.028%