INDEX
    Explanations

    mentions of weapons

    Negative Logits
    pace          -0.92
    weet          -0.77
    leep          -0.73
    cess          -0.71
    ilver         -0.70
    aways         -0.67
    gres          -0.67
    agascar       -0.67
    hu            -0.65
    borough       -0.65

    Positive Logits
    ized           1.09
    ised           1.07
    ry             1.06
    izes           1.05
    izer           1.02
    ization        0.98
    izations       0.93
    iser           0.91
    ises           0.91
    isation        0.89
    Act Density 0.052%

    No Known Activations