INDEX
    Explanations

    references to villains in various contexts

    references to villains and their destructive traits or actions

    Negative Logits
    Token              Logit
    "aeda"             -0.65
    " HELP"            -0.63
    " authenticated"   -0.61
    " OFFIC"           -0.60
    " coerc"           -0.59
    "ERSON"            -0.59
    " livest"          -0.58
    " cooperative"     -0.55
    "ITNESS"           -0.55
    " earners"         -0.54
    Positive Logits
    Token              Logit
    "ous"              2.84
    "ously"            2.58
    "OUS"              1.77
    "osity"            1.43
    "uously"           1.35
    "ized"             1.32
    "izing"            1.31
    "ising"            1.29
    "istic"            1.27
    "iously"           1.25
    Activation Density: 0.096%

    No Known Activations