    Explanations

    phrases that indicate alternative perspectives or rephrasings

    Negative Logits
    Token            Logit
    "inka"           -0.16
    "allon"          -0.15
    "_bw"            -0.14
    "vak"            -0.14
    "ses"            -0.14
    "ー\xE3\x82"     -0.14
    " застав"        -0.14
    "iko"            -0.14
    "ntag"           -0.13
    "rtl"            -0.13
    Positive Logits
    Token            Logit
    " words"         0.46
    "words"          0.37
    " Words"         0.31
    ".words"         0.29
    "_words"         0.29
    "Words"          0.28
    "(words"         0.23
    " palabras"      0.21
    "wards"          0.20
    " слова"         0.19
    Act Density 0.013%

    No Known Activations
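
The page does not define the activation density figure (0.013%). One common reading is the fraction of token positions on which the feature's activation is nonzero. The sketch below illustrates that reading only: the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of tokens on which the feature fires.
    The source page does not state how its figure is computed.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 13 active positions out of 100,000 tokens.
acts = [0.0] * 99_987 + [1.5] * 13
density = activation_density(acts)
print(f"{density:.3%}")
```

Under this reading, a density this low means the feature fires on roughly one token in several thousand, which would be consistent with a feature tied to one narrow family of tokens.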