INDEX
    Explanations

    the word "ache" with varying activations

    Negative Logits
    Token             Logit
    "DERR"            -0.70
    "ODUCT"           -0.68
    " anomal"         -0.62
    " introductory"   -0.62
    " chrom"          -0.61
    " Libertarian"    -0.59
    "POL"             -0.59
    "Trend"           -0.58
    " derog"          -0.58
    " predatory"      -0.58

    Positive Logits
    Token             Logit
    "ache"            1.13
    "rette"           1.05
    "ternity"         0.95
    "tto"             0.94
    "lla"             0.91
    "phrine"          0.90
    "chet"            0.90
    "agne"            0.88
    "ments"           0.87
    "utic"            0.86
    Activation Density: 0.007%

    No Known Activations