    Explanations

    words that express a positive sentiment or quality

    Negative Logits
    Token             Logit
    " myſelf"         -0.78
    " defences"       -0.76
    "InitVars"        -0.75
    " aback"          -0.74
    "inflater"        -0.74
    " Ostat"          -0.70
    "liesslich"       -0.70
    " Thine"          -0.70
    " loisirs"        -0.69
    " tyres"          -0.68
    Positive Logits
    Token             Logit
    " wonderful"      1.76
    " Wonderful"      1.67
    "wonderful"       1.64
    "Wonderful"       1.62
    " marvelous"      1.22
    " WONDER"         1.19
    " wondrous"       1.16
    " wonderfully"    1.10
    " merveilleux"    1.07
    " maravilloso"    1.03
    Activation Density: 0.055%

    No Known Activations