INDEX
    Explanations

    words related to physical harm or attack

    instances of certain names and terms related to entities or characters

    Negative Logits
    "omen"            -0.80
    "athlon"          -0.76
    "ldon"            -0.75
    "ples"            -0.75
    " handshake"      -0.73
    "etermination"    -0.73
    "ally"            -0.73
    "emet"            -0.72
    "icrobial"        -0.72
    "EStreamFrame"    -0.72

    Positive Logits
    " Canaver"         0.82
    " Beng"            0.74
    "glers"            0.74
    " Sebast"          0.74
    " ABE"             0.73
    " Kuala"           0.70
    " Sebastian"       0.69
    " Pengu"           0.65
    "Footnote"         0.65
    " Ops"             0.64
    Act Density 0.033%

    No Known Activations