INDEX
    Explanations

    caught, discovered

    The neuron detects advice or instructions focused on avoiding detection or being caught after committing wrongdoing.

    Negative Logits
    Token         Logit
    "-block"      -0.08
    " Trong"      -0.07
    " стен"       -0.07
    "Su"          -0.06
    " mundane"    -0.06
    " Sterling"   -0.06
    "문의"        -0.06
    " chaotic"    -0.06
    " Pru"        -0.06
    "*(("         -0.06
    Positive Logits
    Token         Logit
    "_NAME"       0.06
    "uggestions"  0.06
    "lest"        0.06
    "UTIL"        0.06
    "振り"        0.06
    " empt"       0.06
    "$title"      0.05
    "ϊ"           0.05
    "VAL"         0.05
    " reperc"     0.05
    Activation Density: 0.009%

    No Known Activations