INDEX
    Explanations

    positive descriptors and adjectives that convey approval or enhancement

    Negative Logits
    ViewFeatures
    -0.71
    клопе
    -0.70
    شهاد
    -0.68
     itſelf
    -0.66
    ειτουργ
    -0.66
    NewUrlParser
    -0.64
    QMetaType
    -0.62
     חיצוניים
    -0.62
     kaynağından
    -0.61
     Monfieur
    -0.60
    Positive Logits
    <eos>
    0.61
    ']").
    0.59
    ']}
    0.57
    )
    
    
    0.56
     )}
    0.52
     pros
    0.52
    0.51
    }}"
    0.50
    `),
    0.49
    *{\
    0.49
    Act Density 0.517%

    No Known Activations