INDEX
Explanations
This neuron activates on instructions or statements that assess whether a given text describes a real (versus fake) human.
Negative Logits
Sed
-0.06
сель
-0.06
ongoose
-0.06
_('
-0.06
.dumps
-0.06
ůvodu
-0.06
CMS
-0.06
-0.06
bufsize
-0.06
--------------
-0.06
Positive Logits
unt
0.07
vd
0.06
světa
0.06
electromagnetic
0.06
Scientific
0.06
Inches
0.06
evils
0.06
stere
0.06
abolic
0.06
那个
0.06
Activations Density 0.023%