INDEX
    Explanations

    words related to penalties or punishment

    references to penalization or punishment

    Negative Logits
    "aeda"                  -0.81
    "through"               -0.73
    "worth"                 -0.72
    "quickShipAvailable"    -0.70
    "elf"                   -0.68
    "lycer"                 -0.66
    "overs"                 -0.65
    "afety"                 -0.64
    "ynthesis"              -0.64
    " RIS"                  -0.64
    Positive Logits
    "ized"                  1.21
    " penal"                1.07
    "ised"                  0.96
    "izes"                  0.93
    "izing"                 0.92
    "ization"               0.91
    "ising"                 0.81
    "ize"                   0.79
    "eties"                 0.78
    " punished"             0.76
    Activation Density: 0.019%

    No Known Activations