INDEX
    Explanations

    concepts related to crime and harm analysis

    Negative Logits
    elas      -0.15
    enet      -0.15
    oice      -0.14
    raith     -0.14
    legen     -0.14
    enso      -0.14
    bbbb      -0.13
     magna    -0.13
    kil       -0.13
    schemas   -0.13

    Positive Logits
    ãĥį       0.15
    ibr       0.15
    ÏĨÏħ      0.14
     argument 0.13
    utor      0.13
    åľŃ       0.13
    amo       0.13
    ioni      0.13
    à¤Ĥà¤ļ    0.13
    opr       0.13
    Activation Density: 0.058%

    No Known Activations