INDEX
Explanations
mentions of a specific entity or concept, "Eraser"
references to individuals with the name "Er" in various contexts
Negative Logits
xual       -0.85
creen      -0.73
smith      -0.71
Parables   -0.70
emouth     -0.69
Gazette    -0.69
erness     -0.69
eenth      -0.68
ciating    -0.67
unct       -0.65
Positive Logits
rors       1.04
aser       0.99
asure      0.89
asures     0.89
bil        0.87
uner       0.86
Lauder     0.85
Er         0.82
Dame       0.80
oute       0.77
Activations Density: 0.007%