    Explanations

    unusual or non-standard characters or symbols in the text

    Negative Logits
    증
    -0.15
    ког
    -0.14
     camps
    -0.14
    erville
    -0.14
    nex
    -0.14
    ्यत
    -0.14
    トリ
    -0.14
    ecome
    -0.14
    ancel
    -0.13
    atos
    -0.13
    Positive Logits
     conf
    0.22
    -conf
    0.21
    /conf
    0.20
    conf
    0.19
     Conf
    0.19
     Confeder
    0.19
     CONF
    0.18
    hlen
    0.18
    Conf
    0.17
    _conf
    0.17
    Activation Density: 0.008%

    No Known Activations