INDEX
    Explanations

    references to words, vocabulary, and language use

    Negative Logits
    Życiorys      -0.61
    IUrlHelper    -0.58
    {(-           -0.56
     Partager     -0.54
    colazione     -0.54
     RSSSF        -0.53
     ['./         -0.52
     poffe        -0.52
     Lalu         -0.51
    "]];          -0.50
    Positive Logits
     words         2.15
     Words         1.93
     word          1.83
    Words          1.82
     WORDS         1.78
    words          1.67
     palabra       1.52
     palabras      1.51
    word           1.47
     Word          1.45
    Act Density 0.236%

    No Known Activations