INDEX
    Explanations

    phrases related to decision-making processes and social issues

    Negative Logits
    "athan"       -0.15
    "ÐĵÐŀ"        -0.15
    "اضÛĮ"        -0.15
    "ÛĮØ·"        -0.15
    "arpa"        -0.14
    "ĺħ"          -0.14
    "antu"        -0.14
    "quipment"    -0.14
    "ibir"        -0.13
    ".micro"      -0.13
    Positive Logits
    " differently"    0.17
    " style"          0.16
    "oji"             0.16
    " Kop"            0.16
    "zk"              0.16
    "style"           0.15
    "anou"            0.15
    "269"             0.15
    "пÑĢав"           0.15
    "et"              0.14
    Activation Density: 0.248%

    No Known Activations