    Explanations

    numerical ratings and scores

    Negative Logits
    oders        -0.17
    ä¸Ī          -0.15
     ÄĮer        -0.15
    asal         -0.15
    elim         -0.15
     лÑĮ         -0.15
     ?><?        -0.14
    rowsable     -0.14
     sujet       -0.14
    unas         -0.14
    Positive Logits
     score       0.20
     overall     0.18
     Score       0.17
     Overall     0.17
     scores      0.17
    rating       0.15
     Scores      0.15
    iy           0.15
    å̤           0.15
    ëŀĢ          0.15
    Act Density 0.069%

    No Known Activations