INDEX
    Explanations

    This neuron is looking for words related to imperfection or flaws

    terms related to permanence and inevitability

    Negative Logits
    anwhile
    -0.85
    wagen
    -0.80
     guiActiveUnfocused
    -0.70
     CDs
    -0.68
    ▬
    -0.67
     GOODMAN
    -0.66
    hops
    -0.65
    hare
    -0.65
    WAYS
    -0.64
    ワン
    -0.64
    Positive Logits
    vious
    1.13
    ishable
    1.13
    missible
    1.10
    manent
    1.09
    iled
    0.89
    mented
    0.89
    bably
    0.89
    pex
    0.87
    redict
    0.86
    igen
    0.85
    Act Density 0.012%

    No Known Activations