INDEX
Explanations
historical references to significant events and societal changes
Negative Logits
Token        Logit
Rosenstein   -0.16
neys         -0.15
Biden        -0.15
(*((         -0.15
oplay        -0.15
الحر         -0.14
iese         -0.14
Pence        -0.14
reesome      -0.13
kiếm         -0.13
Positive Logits
Token        Logit
WWII         0.24
193          0.22
during       0.22
191          0.21
194          0.21
WW           0.21
196          0.20
192          0.20
195          0.20
during       0.18
Activations Density 0.221%