Explanations
The neuron fires on tokens that mark first-person statements, such as “I,” “me,” and “my,” along with subjective qualifiers; in other words, it picks out the writer’s self-references.
Negative Logits
bent        -0.06
ontology    -0.06
sentence    -0.06
Tra         -0.06
.Details    -0.06
ypse        -0.06
petition    -0.06
ukes        -0.06
_prov       -0.06
Camera      -0.06
Positive Logits
>_          0.07
_FEED       0.07
Schedulers  0.07
Hãy         0.07
=[[         0.07
Hil         0.06
Pt          0.06
scoped      0.06
[attr       0.06
nech        0.06
Activations Density 0.025%