    Explanations

    references to organizations and their actions

    Negative Logits
     فريبيس
    -0.80
    batis
    -0.64
    cillor
    -0.64
    XtraGrid
    -0.62
     BoxFit
    -0.60
    =\""
    -0.60
    Diweddarwch
    -0.60
    -0.59
     pinulongan
    -0.59
     تانيه
    -0.58
    Positive Logits
    AntiForgeryToken
    0.66
    Handlung
    0.65
    0.61
     recently
    0.52
    <bos>
    0.52
    .
    0.51
    [toxicity=0]
    0.51
     sibi
    0.50
     broadly
    0.50
    ,
    0.49
    Activation Density: 0.251%

    No Known Activations