INDEX
    Explanations

    instances of the word "for" in various contexts

    Negative Logits
    yl               -0.16
     windowHeight    -0.16
    rech             -0.15
    utt              -0.14
    rippling         -0.14
    ÑĬ               -0.13
    íģ¼              -0.13
    à¥Ĥष             -0.13
    themes           -0.13
    ICC              -0.13

    Positive Logits
    ãĥ¼ãĥĭ           0.20
    ÃĹ↵↵             0.18
    ermen            0.17
    å±¥              0.16
    Ñĵ               0.15
    omer             0.14
    sez              0.14
     "~/             0.14
    obic             0.14
    ando             0.14

    Act Density 0.013%

    No Known Activations