INDEX
    Explanations

    words and abbreviations that express opinions or judgement, especially positive ones

    internet comments/forum snippets

    Negative Logits
    " كومونز"          -0.93
    " itſelf"          -0.74
    "ViewFeatures"     -0.71
    "parsedMessage"    -0.70
    "InvalidProtocol"  -0.68
    " myſelf"          -0.66
    "InputBorder"      -0.66
    "__':\n"           -0.65
    "ſelves"           -0.65
    "()\")"            -0.64
    Positive Logits
    " beautiful"   1.04
    "beautiful"    0.99
    "Beautiful"    0.99
    " lovely"      0.96
    " wonderful"   0.93
    " nice"        0.93
    " gorgeous"    0.93
    " Beautiful"   0.90
    "Lovely"       0.88
    "Wonderful"    0.87
    Activation Density: 0.341%

    No Known Activations