Explanations
code snippets
This neuron activates on words and tokens associated with requests to reorganize or rewrite text for clarity or formality.
Negative Logits
FAIL        -0.07
-template   -0.07
FDA         -0.07
Letters     -0.07
arcer       -0.07
ka          -0.07
Points      -0.06
vinc        -0.06
_direct     -0.06
Diagnostic  -0.06
Positive Logits
)이          0.07
€           0.06
ulong       0.06
досвід      0.06
====        0.06
어진        0.06
etxt        0.06
।           0.06
exceeding   0.06
CEEDED      0.06
Activations Density 0.054%