INDEX
    Explanations

    phrases related to killing or death

    Negative Logits
    "onto"     -0.19
    "vid"      -0.17
    " onto"    -0.15
    "land"     -0.15
    "/out"     -0.15
    "ulled"    -0.15
    "oi"       -0.15
    "eward"    -0.14
    "elect"    -0.14
    "iland"    -0.14
    Positive Logits
    "/disable"    0.20
    "ibri"        0.19
    " spree"      0.19
    "deer"        0.19
    "çİ°åľº"      0.17
    "joy"         0.17
    "switch"      0.16
    "æĪ"          0.16
    "abyrinth"    0.16
    "à¥ľ"         0.16
    Activation Density: 0.057%

    No Known Activations