INDEX
Explanations
affection
The neuron activates on descriptive words that highlight a pet’s warm, friendly, and sociable qualities (e.g. affectionate, playful, gentle, good with children).
Negative Logits
concludes    -0.07
aks          -0.07
inds         -0.07
кі           -0.06
aland        -0.06
dbContext    -0.06
soul         -0.06
ós           -0.06
iking        -0.06
that         -0.06
Positive Logits
Disconnect     0.07
останов        0.07
.Custom        0.06
(userid        0.06
(access        0.06
.pre           0.06
scrambled      0.06
.setString     0.06
getActivity    0.06
gag            0.06
Activations Density: 0.012%