Explanations
The neuron activates on occurrences of the name “Martin.”
Negative Logits
Token    Logit
Zoo      -0.07
Yellow   -0.07
819      -0.07
879      -0.07
Encore   -0.07
Peach    -0.07
Face     -0.06
अक       -0.06
Byrne    -0.06
rible    -0.06
Positive Logits
Token    Logit
Martin   0.17
Martin   0.13
martin   0.11
Marty    0.09
Craig    0.08
Lewis    0.08
TG       0.08
Martins  0.08
Ross     0.08
Gordon   0.08
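The logit lists above are usually read as the tokens a neuron most boosts or suppresses in the model's output. The page does not say how they were computed, so the following is only a minimal sketch under one common assumption: that each token's score is the dot product of the neuron's output direction with that token's unembedding vector. The function `logit_effects`, the 2-dimensional vectors, and the three-token vocabulary are all hypothetical and invented for illustration; they are not the values behind the numbers on this page.

```python
from typing import Dict, List, Tuple


def logit_effects(direction: List[float],
                  unembed: Dict[str, List[float]]) -> List[Tuple[str, float]]:
    """Score each token by the dot product of its unembedding vector with
    the neuron's output direction, sorted from most boosted to most suppressed."""
    scores = [
        (token, sum(d * u for d, u in zip(direction, vec)))
        for token, vec in unembed.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy values, made up for this sketch only (assumption, not real weights).
direction = [1.0, 0.0]
unembed = {
    "Martin": [0.5, 0.3],
    "Craig": [0.2, -0.2],
    "Zoo": [-0.3, 0.5],
}

ranked = logit_effects(direction, unembed)
top_positive = ranked[:2]   # tokens the direction pushes up the most
top_negative = ranked[-1:]  # tokens the direction pushes down the most
print(top_positive)
print(top_negative)
```

In a real model the same ranking would run over the full vocabulary, and the top and bottom ten entries would correspond to the two lists shown here.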
Activation Density: 0.017%
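Activation density is typically understood as the share of sampled tokens on which the neuron fires at all. The sketch below assumes that reading, with "fires" meaning an activation above a threshold of zero; the function `activation_density` and the sample data are hypothetical and do not reproduce the 0.017% figure.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds the threshold."""
    if not activations:
        return 0.0
    firing = sum(1 for a in activations if a > threshold)
    return firing / len(activations)


# Hypothetical sample: 10,000 tokens, with the neuron firing on 2 of them.
sample = [0.0] * 10_000
sample[10] = 3.2
sample[4_200] = 1.1

density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.020%
```

A density this low is consistent with a neuron that responds to one specific name rather than to a broad class of text.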