INDEX
Explanations
mentions of the name "Harris" in varying contexts
Negative Logits
nces        -0.73
rious       -0.71
zzi         -0.69
eers        -0.64
erous       -0.63
privilege   -0.60
rous        -0.60
lihood      -0.59
unal        -0.59
uers        -0.58
Positive Logits
burg        1.39
burgh       1.01
mann        0.97
abis        0.92
bury        0.89
acre        0.89
alez        0.88
bach        0.88
laughter    0.86
kins        0.85
Activations Density 0.020%