Explanations
The neuron activates on occurrences of the word “machine.”
Negative Logits
Token           Logit
longevity       -0.08
or              -0.07
redd            -0.07
plo             -0.07
(no token text) -0.06
Corey           -0.06
ียนบ            -0.06
Ortiz           -0.06
943             -0.06
readers         -0.06
Positive Logits
Token           Logit
machine         0.15
machines        0.13
Machine         0.13
Machine         0.12
MACHINE         0.11
Machines        0.11
machining       0.11
machinery       0.10
_machine        0.09
машин           0.09
Activation Density: 0.025%
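
For orientation, the density figure can be read as the fraction of token positions on which the neuron fires. The sketch below is a minimal illustration under that assumption; the page does not state how the figure is computed, and the function name `activation_density`, the zero threshold, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of tokens with a nonzero activation.
    The source page does not define the term.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 4000 token positions, exactly one of which activates.
example = [0.0] * 3999 + [1.7]
density = activation_density(example)
print(f"{density:.3%}")
```

With this made-up sample, one active position in 4000 corresponds to the same 0.025% order of magnitude shown above, i.e. roughly one token in four thousand.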