INDEX
    Explanations

    mentions of the word "Gram" and its related forms, indicating a focus on measurements and metrics

    Negative Logits
    Token       Logit
    657         -0.17
    ↵↵          -0.17
    uien        -0.16
    gue         -0.16
    utters      -0.15
    ¼åIJĪ       -0.15
    tuÄŁ        -0.15
    eft         -0.15
    ÑĤо         -0.14
    bane        -0.14
    Positive Logits
    Token       Logit
    ophone      0.35
    à¥Ģण       0.28
    mys         0.28
    erc         0.27
    atical      0.27
    bling       0.25
    mer         0.25
    matic       0.25
    sci         0.24
    atically    0.23
    Activation Density: 0.005%

    No Known Activations