INDEX
    Explanations
    Negative Logits
     propOrder
    -0.71
     Bruce
    -0.55
     مشين
    -0.54
    Bruce
    -0.53
     Shores
    -0.53
    ſelves
    -0.51
     Clement
    -0.51
    JAMIN
    -0.51
    skyl
    -0.51
     bruce
    -0.50
    Positive Logits
    OfWork
    0.77
    o
    0.74
    ه
    0.71
    e
    0.71
    y
    0.69
    i
    0.69
    a
    0.63
    ovi
    0.58
    u
    0.56
    yq
    0.56
    Activation Density: 0.904%

    No Known Activations