INDEX
Explanations
references to historical military events and their significance
Negative Logits
qed         -0.16
Cecil       -0.15
avir        -0.15
bote        -0.14
riot        -0.14
poke        -0.14
avia        -0.14
ÑĨаÑĢ       -0.14
Ze          -0.14
enser       -0.14
Positive Logits
Norm         0.37
Norm         0.27
liberation   0.26
landing      0.25
Liberation   0.25
norm         0.24
Landing      0.24
landing      0.24
liberated    0.23
beaches      0.22
Activations Density: 0.072%
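To make the density figure concrete, here is a minimal sketch of how such a number could be computed. It assumes that density means the fraction of token positions on which the feature's activation is above zero; the source does not state the definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, of which 72 fire.
sample = [0.0] * 99_928 + [1.5] * 72
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.072% corresponds to the feature firing on roughly 72 of every 100,000 tokens.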