    Explanations

    punctuation

    This neuron detects instructional phrases that specify using only information from the given documents.

    Negative Logits
    Token                   Logit
    " bullshit"             -0.07
    " Kew"                  -0.06
    (token not recovered)   -0.06
    " ki"                   -0.06
    "ABCDEFG"               -0.06
    "_iteration"            -0.06
    " grooming"             -0.06
    "Storyboard"            -0.06
    " Particle"             -0.06
    "890"                   -0.06
    Positive Logits
    Token                   Logit
    " عز"                   0.07
    " Follow"               0.07
    "hpp"                   0.06
    " вне"                  0.06
    " valued"               0.06
    (token not recovered)   0.06
    "bohydr"                0.06
    " principales"          0.06
    " фор"                  0.06
    "ger"                   0.06
    Activation Density: 0.024%

    No Known Activations