INDEX
Explanations
mentions of a particular individual named Johnson
instances of the name "Johnson."
Negative Logits
Token          Logit
fuse           -0.71
rises          -0.68
mble           -0.68
nces           -0.67
PDATE          -0.67
amplification  -0.66
peria          -0.63
clearing       -0.63
downside       -0.62
thur           -0.62
Positive Logits
Token          Logit
Johnson        1.07
stown          1.04
Controls       0.98
Johnson        0.97
ston           0.95
STON           0.92
ota            0.88
son            0.82
inson          0.80
Skywalker      0.80
Activations Density: 0.011%