INDEX
    Explanations

    This neuron detects instructions or statements that assess whether a given text describes a real (as opposed to fake) human.

    Negative Logits
     Sed
    -0.06
     сель
    -0.06
    ongoose
    -0.06
     _('
    -0.06
    .dumps
    -0.06
    ůvodu
    -0.06
    CMS
    -0.06
         
    -0.06
     bufsize
    -0.06
    --------------
    -0.06
    Positive Logits
     unt
    0.07
     vd
    0.06
     světa
    0.06
     electromagnetic
    0.06
     Scientific
    0.06
     Inches
    0.06
     evils
    0.06
     stere
    0.06
    abolic
    0.06
    那个
    0.06
    Activation Density: 0.023%

    No Known Activations