INDEX
Explanations
The neuron fires on evaluative, opinionated adjectives and descriptors (e.g. flattering, perfect, misleading) that convey subjective judgments about appearance or fit.
New Auto-Interp
Negative Logits
subsidiary
-0.07
acerb
-0.07
Saunders
-0.06
diligent
-0.06
Nero
-0.06
ric
-0.06
.output
-0.06
रहत
-0.06
milan
-0.06
kB
-0.06
POSITIVE LOGITS
tkinter
0.07
_INTEGER
0.06
ακό
0.06
thậm
0.06
ситуа
0.06
apons
0.06
_REPLY
0.06
patient
0.05
uxe
0.05
يلة
0.05
Activations Density 0.125%