    Explanations

    instances of the word "in."

    Negative Logits
    "下去"          -0.16
    "fang"          -0.15
    "ignite"        -0.14
    "vert"          -0.13
    "ュ"            -0.13
    "ฐาน"           -0.13
    "urve"          -0.13
    "lage"          -0.13
    "WhiteSpace"    -0.13
    "ju"            -0.12
    Positive Logits
    " tow"           0.39
    " play"          0.32
    " sight"         0.32
    " store"         0.31
    " mind"          0.29
    " reserve"       0.29
    " place"         0.28
    " attendance"    0.24
    "play"           0.24
    " hand"          0.23
    Activation Density: 0.153%

    No Known Activations