INDEX
Explanations
mentions of specific locations or incidents involving them
Negative Logits
merce        -0.86
ratulations  -0.83
faced        -0.81
hai          -0.70
mun          -0.70
succeeded    -0.69
tailed       -0.69
rils         -0.68
busters      -0.68
Mask         -0.65
Positive Logits
afar         1.73
whence       1.30
scratch      1.18
abroad       1.08
thence       1.00
inside       0.96
inception    0.94
infancy      0.87
obscurity    0.87
within       0.87
Activations Density: 2.147%