    Explanations

    Across the provided activations, this neuron consistently fires on occurrences of the word "take" in a variety of contexts.

    Instances of the phrase "take a" followed by various continuations.

    Negative Logits
    ndra         -0.79
    ells         -0.71
    Ü            -0.69
     displays    -0.68
    tions        -0.66
    tu           -0.65
    IAS          -0.65
    IOR          -0.64
    eller        -0.64
    α            -0.63
    Positive Logits
     seriously   0.91
     lightly     0.84
     plunge      0.83
     reins       0.81
     cue         0.78
     stride      0.78
     tumble      0.73
    ドラ         0.70
     lesson      0.67
     tack        0.67
    Act Density 0.147%

    No Known Activations