    Negative Logits
        Token          Logit
        "t"            0.72
        " "            0.70
        "س"            0.64
        "ONS"          0.63
        "“,"           0.62
        (not shown)    0.62
        "ه"            0.61
        "Q"            0.61
        (not shown)    0.61
        "_"            0.60
    Positive Logits
        Token          Logit
        " Burnu"       0.63
        " médias"      0.62
        "on"           0.61
        "ikian"        0.59
        "kowej"        0.59
        " Promised"    0.59
        " fondness"    0.58
        "ikia"         0.58
        " Jem"         0.57
        " निष्कर्ष"      0.57
    Activation Density: 0.029%

    No Known Activations