INDEX
    Explanations

    phrases containing conversational elements or politeness markers

    Negative Logits
    Token           Logit
    (not captured)  -0.54
    " L"            -0.54
    " who"          -0.50
    " ("            -0.50
    " didn"         -0.49
    " \"            -0.49
    " isn"          -0.49
    " As"           -0.48
    " H"            -0.48
    " Who"          -0.47
    Positive Logits
    Token               Logit
    " Wikimedijinoj"    1.15
    " disambiguazione"  1.00
    "Rüyada"            0.99
    ")++;"              0.99
    "Життєпис"          0.98
    " ModelExpression"  0.96
    " AssemblyVersion"  0.93
    "RTEE"              0.93
    "MessageTagHelper"  0.91
    " useContext"       0.91
    Activation Density: 0.052%

    No Known Activations