INDEX
Explanations
mentions of specific historical years and events
Negative Logits
_INET      -0.17
gow        -0.16
uing       -0.15
isci       -0.15
atha       -0.15
ycz        -0.14
oru        -0.14
ilter      -0.14
urg        -0.14
caster     -0.14
Positive Logits
Dun        0.16
Sing       0.15
mad        0.15
dun        0.15
ener       0.14
traceback  0.14
467        0.14
Dunn       0.14
ty         0.14
å¯Ħ        0.14
Activations Density 0.011%