INDEX
Explanations
references to historical events and their implications
Negative Logits
195           -0.25
196           -0.23
197           -0.20
tv            -0.20
198           -0.20
_TV           -0.20
194           -0.19
televised     -0.19
televis       -0.19
television    -0.19
Positive Logits
191           0.60
192           0.48
WW            0.35
Bolshevik     0.33
190           0.28
Bols          0.26
۱۹            0.21
bols          0.21
twenties      0.21
WW            0.20
Activations Density 0.293%