    Explanations

    elements related to sentencing and language structure

    Negative Logits
    Token              Logit
    " Warburton"       -0.81
    "первых"           -0.73
    "dolu"             -0.73
    "ofire"            -0.71
    " חיצוניים"        -0.70
    "icoot"            -0.69
    "chaikovsky"       -0.67
    "Warna"            -0.66
    " nicio"           -0.66
    "AutoScaleMode"    -0.65

    Positive Logits
    Token              Logit
    " sentences"       1.59
    " sentence"        1.54
    " Sentence"        1.51
    " Sentences"       1.36
    "sentences"        1.31
    "Sentence"         1.27
    "sentence"         1.22
    " sentenced"       1.03
    " frase"           0.99
    " Cune"            0.94

    Activation Density: 0.136%

    No Known Activations