INDEX
    Explanations

    instances of the word "Chart", with a preference for higher activations

    occurrences of the word "chart" and its variations

    Negative Logits
    Token           Logit
    " Romeo"        -0.68
    "ignt"          -0.66
    "vae"           -0.64
    "IRE"           -0.61
    " Paulo"        -0.60
    " Cic"          -0.60
    " Chinatown"    -0.59
    " LIMITED"      -0.59
    "cknow"         -0.58
    " Goodman"      -0.57
    Positive Logits
    Token           Logit
    "ered"          1.32
    "ued"           1.06
    "ing"           0.90
    "erer"          0.90
    "uing"          0.86
    "ed"            0.86
    "erers"         0.84
    "chart"         0.82
    "icle"          0.80
    "eer"           0.80
    Act Density 0.033%

    No Known Activations