INDEX
    Explanations

    expressions of positivity and approval

    Negative Logits
    Token               Logit
    "IBarButtonItem"    -0.84
    "FailureListener"   -0.79
    " MonoBehaviour"    -0.74
    " Guimarães"        -0.70
    "TestingModule"     -0.70
    " CNT"              -0.70
    "olum"              -0.70
    " mutagen"          -0.69
    "ărul"              -0.69
    " Jérusalem"        -0.68
    Positive Logits
    Token               Logit
    " NICE"             1.23
    "NICE"              1.10
    " Nice"             1.04
    " nice"             1.03
    "nice"              1.00
    "Nice"              0.93
    " nicest"           0.89
    " nic"              0.88
    " nicer"            0.85
    "Nasty"             0.78
    Activation Density: 0.009%

    No Known Activations