Explanations
The neuron activates on the suffix “ize”; that is, it detects words ending in “ize.”
Negative Logits
unless      -0.07
dart        -0.07
acts        -0.07
hold        -0.06
occult      -0.06
ought       -0.06
sixth       -0.06
letting     -0.06
achieving   -0.06
let         -0.06
Positive Logits
ize         0.16
ized        0.14
ization     0.13
izes        0.12
iz          0.12
IZ          0.11
IZED        0.11
ise         0.11
izing       0.11
из          0.11
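One way to sanity-check the explanation against the positive logits is to count how many of the listed tokens share the “iz” stem. The sketch below is illustrative only: the token values are copied from the list above, while the helper name `shares_stem` and the case-insensitive substring check are assumptions made for this example, not part of the dashboard.

```python
# Top positive-logit tokens and values, copied from the list above.
positive_logits = {
    "ize": 0.16, "ized": 0.14, "ization": 0.13, "izes": 0.12, "iz": 0.12,
    "IZ": 0.11, "IZED": 0.11, "ise": 0.11, "izing": 0.11, "из": 0.11,
}


def shares_stem(token: str, stem: str = "iz") -> bool:
    """Hypothetical helper: True if the token contains the stem, ignoring case."""
    return stem in token.lower()


# Split the tokens into those containing the Latin stem "iz" and the rest.
matching = [tok for tok in positive_logits if shares_stem(tok)]
others = [tok for tok in positive_logits if not shares_stem(tok)]

print(len(matching), others)  # 8 ['ise', 'из']
```

The two tokens that do not match are the British spelling “ise” and the Cyrillic “из”, both close variants of the same suffix, which is consistent with the explanation.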
Activations Density 0.131%
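A density of 0.131% corresponds to roughly 1.3 activations per 1,000 tokens. The sketch below shows one plausible way such a figure could be computed, assuming density means the fraction of sampled token positions where the neuron's activation is above zero. The function name, the threshold, and the sample data are all hypothetical; the dashboard does not state how the figure was derived.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumed definition for illustration; not taken from the dashboard.
    """
    if not activations:
        raise ValueError("need at least one activation")
    firing = sum(1 for a in activations if a > threshold)
    return firing / len(activations)


# Hypothetical sample: 131 firing positions out of 100,000 tokens.
sample = [1.0] * 131 + [0.0] * (100_000 - 131)
density = activation_density(sample)

print(f"{density:.3%}")  # 0.131%
```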