INDEX
Explanations
general text
structured arguments and formal writing characteristics in essays.
This neuron selectively activates on words that appear at the start of sentences or document sections.
Negative Logits
Dll           -0.07
downtown      -0.07
Going         -0.06
slain         -0.06
journalist    -0.06
Walt          -0.06
broadcaster   -0.06
Thread        -0.06
unge          -0.06
wood          -0.06
Positive Logits
itori         0.07
thereby       0.07
попыт         0.06
ijd           0.06
rowave        0.06
MEDIA         0.06
-if           0.06
knife         0.06
这样          0.06
momentos      0.06
Activations Density: 0.071%