    Explanations

    negative sentiment

    Negative Logits
     walk         -1.02
     terrible     -0.96
     awful        -0.90
    terrible      -0.83
     horrible     -0.79
     dreadful     -0.77
     buried       -0.77
    Terrible      -0.71
    als           -0.65
     horribly     -0.64
    Positive Logits
     protoimpl    0.62
    DoubleQuotes  0.59
     Andromeda    0.58
    incar         0.55
    onymy         0.55
     pleaſure     0.55
    αρα           0.54
     jadx         0.54
     spind        0.54
     overriding   0.53
    Activation Density 0.028%

    No Known Activations