INDEX
Explanations
dates from the mid-20th century
references to significant historical years
Negative Logits
Token       Logit
parency     -0.71
edo         -0.69
resso       -0.68
clus        -0.65
venge       -0.65
opher       -0.65
hed         -0.64
verbs       -0.63
unity       -0.62
impunity    -0.61
Positive Logits
Token       Logit
å¹          0.77
-'          0.75
keley       0.73
çļ          0.72
1945        0.65
MacArthur   0.64
1943        0.64
Reincarn    0.64
1944        0.63
Reich       0.63
Activations Density: 0.041%