Explanations
This neuron fires on the word “device” (and related patent-style phrasing around invention descriptions).
Negative Logits
Precio            -0.07
RowAtIndexPath    -0.06
ORIZATION         -0.06
Appointment       -0.06
Personen          -0.06
-from             -0.06
zeigt             -0.06
wd                -0.06
altet             -0.06
_extended         -0.06
Positive Logits
Institute            0.06
deposit              0.06
exampleInputEmail    0.06
Magical              0.06
λογή                 0.06
TOM                  0.06
şam                  0.06
snatch               0.06
Seb                  0.06
BEGIN                0.06
Activations Density: 0.039%
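
The page does not define how the density figure is computed. A common reading, assumed here, is the share of sampled token positions on which the neuron's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the threshold, and the sample data are all hypothetical and do not come from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of active positions; the
    source page does not state its exact definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 4 of them active.
sample = [0.0] * 9996 + [1.2, 0.4, 3.1, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.039% would mean the neuron is active on roughly 4 of every 10,000 sampled tokens.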