Explanations
terms related to Nazi Germany and historical events associated with it
NEGATIVE LOGITS
imos     -0.16
Pascal   -0.15
okin     -0.15
ft       -0.15
ibble    -0.13
onaut    -0.13
anarch   -0.13
orra     -0.13
Merc     -0.13
/us      -0.13
POSITIVE LOGITS
Hitler   0.47
Hit      0.41
Nazi     0.41
Nazis    0.38
naz      0.36
SS       0.35
Naz      0.35
Naz      0.35
NS       0.34
Hit      0.34
Activations Density 0.152%