INDEX
    Explanations

    references to 'points' or key ideas within a discussion or argument

    Negative Logits
    Token         Logit
    "afs"         -0.15
    "彦"          -0.15
    "bsolute"     -0.14
    "ajas"        -0.14
    "ths"         -0.14
    "thur"        -0.14
    "hwnd"        -0.14
    "¤í"          -0.14
    "oulos"       -0.14
    "him"         -0.14
    Positive Logits
    Token         Logit
    "blank"       0.33
    "lessly"      0.31
    "ill"         0.30
    "y"           0.29
    " blank"      0.28
    "lessness"    0.28
    "Blank"       0.27
    " Blank"      0.27
    "-of"         0.27
    "edly"        0.26
    Activation Density: 0.050%

    No Known Activations