    Explanations

    instances and variations of the word "for" in different contexts

    Negative Logits
    Token              Logit
    "ix"               -0.16
    " confirmation"    -0.14
    " background"      -0.14
    "set"              -0.14
    " hol"             -0.14
    "ono"              -0.14
    " trap"            -0.14
    "obs"              -0.14
    " Bieber"          -0.14
    "PT"               -0.13
    Positive Logits
    Token              Logit
    "erah"             0.20
    "ocht"             0.17
    "riere"            0.15
    "ÑĤоÑĦ"           0.15
    "cher"             0.15
    "ẫn"               0.15
    "-article"         0.15
    "icher"            0.15
    "wner"             0.15
    "edith"            0.15
    Activation Density: 0.010%

    No Known Activations