INDEX
    Explanations

    positive adjectives and qualifiers that indicate a favorable evaluation

    Negative Logits
    (whitespace)        -0.35
    didReceive          -0.30
    ↵↵                  -0.27
     negó               -0.27
     proyección         -0.26
     -                  -0.25
     comunicación       -0.23
    userId              -0.23
    说不                -0.23
     return             -0.23

    Positive Logits
    AndEndTag           0.87
    новништво           0.87
    ロウィン            0.81
     Taktlose           0.79
    WriteTagHelper      0.79
     EconPapers         0.78
    InitVars            0.77
    sidemargin          0.77
    SharedCtor          0.76
    ConstraintMaker     0.75
    Activation Density: 0.125%

    No Known Activations