INDEX
    Explanations

    expressions of preference or recommendation

    Negative Logits
    Token          Logit
    " điển"        -0.62
    " Quell"       -0.60
    " Efq"         -0.59
    " morada"      -0.58
    "ZZI"          -0.57
    " realizing"   -0.57
    " Athenians"   -0.56
    "ricos"        -0.56
    " Fenn"        -0.55
    "źć"           -0.55
    Positive Logits
    Token          Logit
    " prefer"      0.76
    "ıklı"         0.71
    "sidemargin"   0.69
    "#"            0.68
    " gärna"       0.66
    " recommend"   0.64
    " дописавши"   0.62
    " liever"      0.61
    " Wert"        0.60
    " faut"        0.60
    Activation Density: 0.078%

    No Known Activations