INDEX
    Explanations

    mentions of the letter "A" with high activation values

    the letter 'A' in various contexts

    Negative Logits
    " Vaugh"      -0.67
    "enegger"     -0.65
    " Alph"       -0.63
    " Indigo"     -0.60
    "reports"     -0.60
    "email"       -0.59
    " Nicaragua"  -0.58
    " optics"     -0.57
    " Everton"    -0.57
    "Param"       -0.57
    Positive Logits
    "cknowled"  1.19
    "verages"   1.18
    "ussie"     1.17
    "cknow"     1.11
    "uctions"   1.10
    "irst"      1.01
    "lyss"      0.98
    "roma"      0.98
    "ryan"      0.96
    "perture"   0.94
    Activation Density: 0.105%

    No Known Activations