INDEX
Explanations
morality
This neuron activates on philosophical and ethics-related terminology, such as words about morality, ethics, ontology, and philosophy.
Negative Logits
xpath         -0.06
Salt          -0.06
态            -0.06
FOR           -0.06
До            -0.06
AccessType    -0.06
straight      -0.06
map           -0.06
gross         -0.06
edula         -0.06
Positive Logits
rowned               0.06
y                    0.06
yb                   0.06
_lcd                 0.06
musique              0.06
incredible           0.06
:@"%@",              0.06
Fifth                0.06
<|begin_of_text|>    0.06
/mp                  0.06
Activations Density: 0.041%