Explanations
This neuron activates on mentions of the word “life.”
Negative Logits
_fa         -0.08
fk          -0.07
tilted      -0.07
ès          -0.07
billing     -0.07
.creator    -0.07
_Arg        -0.07
etadata     -0.06
Heal        -0.06
इक          -0.06
Positive Logits
life        0.10
Life        0.08
LIFE        0.07
-life       0.06
lifes       0.06
lif         0.06
live        0.06
dní         0.06
меди        0.06
happiness   0.06
Activations Density 0.017%
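
To give a feel for how sparse a density of 0.017% is, the figure can be converted into an average spacing between activations. This is a minimal sketch, assuming that "density" means the fraction of sampled tokens on which the neuron has a nonzero activation; the page does not define it. The function names are hypothetical and used only for illustration.

```python
def tokens_per_activation(density_percent: float) -> float:
    """Average number of tokens per activation, given a density in percent.

    Assumption: density is the share of tokens with nonzero activation.
    """
    if density_percent <= 0:
        raise ValueError("density must be positive")
    return 100.0 / density_percent


def expected_activations(density_percent: float, n_tokens: int) -> float:
    """Expected number of activating tokens in a sample of n_tokens."""
    return n_tokens * density_percent / 100.0


if __name__ == "__main__":
    density = 0.017  # percent, as reported on this page
    print(f"about 1 activation per {tokens_per_activation(density):.0f} tokens")
    print(f"about {expected_activations(density, 1_000_000):.0f} per million tokens")
```

Under that assumption, the neuron fires on roughly one token in several thousand, which fits a neuron tied to a single word and its close variants.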