INDEX
    Explanations

    question types and descriptions

    Negative Logits
    " answers"    0.40
    " بدون"       0.39
    " دون"        0.39
    " المطل"      0.39
    " bith"       0.39
    " Diva"       0.37
    " без"        0.37
                  0.36
    "swers"       0.36
    " cosm"       0.36
    Positive Logits
    "RQ"     0.57
    "dq"     0.48
    "cq"     0.48
    " CQ"    0.46
    "rq"     0.45
    "qt"     0.45
    "qr"     0.44
    "TQ"     0.44
    "CQ"     0.42
    "QC"     0.42
    Activation Density 0.001%

    No Known Activations