INDEX
Explanations
references to a specific person named "Brock"
the presence of the name "Brock" in various contexts
Negative Logits
gered         -0.72
conspicuous   -0.69
dfx           -0.67
ffic          -0.67
SERVICE       -0.66
ired          -0.65
cytok         -0.63
CoC           -0.62
udic          -0.62
socket        -0.61
Positive Logits
Brock         1.04
lings         0.98
mire          0.92
leys          0.86
halla         0.86
Upton         0.86
hurst         0.84
strap         0.83
stre          0.83
ham           0.82
Activations Density: 0.008%