    Explanations

    The neuron is looking for instances where the phrase "I mean" is used in a sentence

    phrases indicating human behavior or inclinations

    Negative Logits
    "artifacts"   -0.73
    "ifference"   -0.70
    "gression"    -0.68
    "htaking"     -0.63
    "watching"    -0.62
    " threads"    -0.61
    " imped"      -0.59
    "paying"      -0.59
    "immune"      -0.59
    "udeau"       -0.58

    Positive Logits
    " acronym"     1.47
    " slang"       1.35
    " moniker"     1.34
    " abbre"       1.33
    " coined"      1.31
    " term"        1.29
    " nickname"    1.28
    " name"        1.23
    " referring"   1.15
    " shorthand"   1.14

    Activation Density: 0.670%

    No Known Activations