    Explanations

    advice/warnings

    This neuron activates on imperative or warning language: phrases that give instructions or urge action (e.g. “pay attention,” “run away,” “watch out”).

    Negative Logits
    Token               Logit
    "gae"               -0.07
    "Airport"           -0.06
    "deer"              -0.06
    ".tick"             -0.06
    ".numericUpDown"    -0.06
    " unsuccessful"     -0.06
    " रव"               -0.06
    "Price"             -0.06
    " Иванов"           -0.06
    "пов"               -0.06
    Positive Logits
    Token               Logit
    " aşam"             0.07
    "uib"               0.07
    " Reduce"           0.07
    ">,"                0.07
    ".bo"               0.07
    (not recovered)     0.07
    "يب"                0.06
    " inhab"            0.06
    "سب"                0.06
    "(argc"             0.06
    Activation Density: 0.046%

    No Known Activations