INDEX
    Explanations

    the presence of names or identifiers in the text

    Negative Logits
    Token       Logit
     splits     -0.15
    çĸ          -0.15
    zier        -0.15
     tick       -0.14
    iew         -0.14
    iverse      -0.14
    ummy        -0.14
    ÄŁan        -0.13
    ersonic     -0.13
    ohan        -0.13
    Positive Logits
    Token       Logit
    kke         0.16
    inki        0.15
    776         0.15
    ours        0.14
    okit        0.14
    çĤİ         0.14
    /Dk         0.14
    ansi        0.14
     Castro     0.14
     Patton     0.14
    Activation Density: 0.106%

    No Known Activations