    Explanations

    This neuron detects words and short phrases signaling that the reader has not yet done or seen something: negations such as “haven’t” and “not,” and related temporal adverbs such as “yet” and “already.”

    Negative Logits
    " DAY"               -0.07
    " divisor"           -0.07
    " Shoot"             -0.06
    " Pass"              -0.06
    (token not shown)    -0.06
    " dress"             -0.06
    " Self"              -0.06
    " когда"             -0.06
    "subscriptions"      -0.06
    " Hard"              -0.06
    Positive Logits
    " utilized"          0.07
    ":↵"                 0.07
    "ourced"             0.07
    " phận"              0.06
    " đ"                 0.06
    "lenmiş"             0.06
    " xAxis"             0.06
    " všech"             0.06
    "duğ"                0.06
    " ゙"                 0.06
    Activation Density: 0.010%
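
    To put the density figure in concrete terms, here is a minimal sketch. It assumes that activation density means the fraction of sampled tokens on which the neuron's activation is above zero; the page does not define the term, so treat that as an assumption. The helper name `activation_density` and the toy data are hypothetical, for illustration only.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Hypothetical helper: assumes density = (tokens that fire) / (tokens sampled).
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Toy data (not from the page): one firing token among 10,000 sampled tokens.
toy = [0.0] * 9_999 + [1.3]
density = activation_density(toy)
print(f"{density:.3%}")  # prints 0.010%
```

    Under that assumption, a density of 0.010% corresponds to the neuron firing on roughly one token in every 10,000.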

    No Known Activations