Explanations
This neuron activates on occurrences of the word “browser.”
Negative Logits
Install   -0.08
inq       -0.07
anj       -0.06
trail     -0.06
element   -0.06
010       -0.06
task      -0.06
conte     -0.06
uite      -0.06
tank      -0.06
Positive Logits
Browser              0.13
browser              0.12
browser              0.10
browsers             0.10
Browser              0.09
Bieber               0.08
-browser             0.08
antwort              0.08
(no visible token)   0.08
boredom              0.08
Activations Density 0.010%
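
As a rough illustration of the density figure, the sketch below assumes that activation density means the fraction of tokens on which the neuron's activation exceeds a threshold (zero here). That definition, the function name `activation_density`, and the sample data are assumptions made for illustration and are not taken from this page. Under that reading, 0.010% corresponds to about 1 token in 10,000.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumed definition of "density"; the page itself does not state one.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 tokens, exactly one of which fires the neuron.
acts = [0.0] * 10_000
acts[1234] = 3.7

density = activation_density(acts)
print(f"{density:.3%}")  # prints "0.010%"
```

A density this low would be consistent with a neuron that fires only on a narrow set of tokens, such as variants of "browser".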