INDEX
    Explanations

    occurrences of the word "in."

    Negative Logits
    Token        Logit
    "671"        -0.17
    "vert"       -0.15
    "ãĥ¥"        -0.15
    "ivel"       -0.14
    "iling"      -0.14
    "anto"       -0.14
    "bat"        -0.14
    "aha"        -0.14
    "etz"        -0.14
    "elt"        -0.13
    Positive Logits
    Token        Logit
    " tow"       0.31
    " sight"     0.30
    " play"      0.25
    " reach"     0.23
    " hand"      0.21
    " Sight"     0.21
    " plain"     0.19
    "Evidence"   0.18
    "ighted"     0.18
    " evidence"  0.18
    Activation Density: 0.161%

    No Known Activations