INDEX
    Explanations

    words ending in the suffix '-y', with the strongest activations on the token 'y' itself

    occurrences of the letter 'y'

    Negative Logits
    raltar          -0.81
     Examiner       -0.78
    IBLE            -0.70
    icably          -0.68
    insula          -0.68
    itures          -0.67
    bernatorial     -0.65
    ãĥ¯             -0.63
     vou            -0.63
     foss           -0.63
    Positive Logits
    ield             1.02
    ielding          0.99
    Å«               0.92
    aku              0.90
    ank              0.86
    olk              0.85
    mbol             0.81
    ikes             0.81
    ng               0.80
    STEM             0.80
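
Two entries in the logit lists (ãĥ¯ and Å«) look garbled but may be raw vocabulary strings rather than encoding damage. The sketch below assumes the tokens come from a GPT-2-style byte-level BPE vocabulary, which this page does not confirm. In that scheme every byte is written as one printable character, so a multi-byte UTF-8 character appears as two or three unrelated symbols. The names bytes_to_unicode, UNICODE_TO_BYTE, and decode_token are illustrative, not part of any tool referenced here.

```python
def bytes_to_unicode() -> dict[int, str]:
    """Byte -> printable-character table in the style of GPT-2 byte-level BPE (assumed)."""
    # Bytes that are already printable map to themselves.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    table = {b: chr(b) for b in printable}
    # Every remaining byte is shifted to an unused code point starting at 256.
    offset = 0
    for b in range(256):
        if b not in table:
            table[b] = chr(256 + offset)
            offset += 1
    return table


# Inverse table: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Turn a byte-level token string back into readable text."""
    raw = bytearray()
    for ch in token:
        if ch in UNICODE_TO_BYTE:
            raw.append(UNICODE_TO_BYTE[ch])
        else:
            # Characters outside the table (e.g. a space the page already
            # decoded) are kept as-is.
            raw.extend(ch.encode("utf-8"))
    # A token may hold only part of a character, so never raise on bad UTF-8.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_token("Å«"))    # ū
    print(decode_token("ãĥ¯"))   # ワ
    print(decode_token(" vou"))  # unchanged: " vou"
```

Under that assumption, Å« corresponds to the bytes C5 AB (the letter ū) and ãĥ¯ to E3 83 AF (the katakana ワ). Plain ASCII tokens such as ield pass through unchanged.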
    Act Density 0.055%

    No Known Activations