Explanations
The main thing this neuron does is find nouns related to legal and financial issues
proper nouns and specific names of people or entities
Negative Logits
Nath      -0.92
ayne      -0.91
CLA       -0.88
Shepard   -0.86
Miranda   -0.80
ay        -0.74
anni      -0.74
pard      -0.74
aint      -0.72
Holden    -0.72
Positive Logits
ub        1.44
Tub       1.43
Lob       1.39
Dob       1.39
Dub       1.33
Kub       1.31
Hub       1.30
Dub       1.29
UB        1.28
rub       1.27
Activations Density: 0.425%