INDEX
    Explanations

    positive affirmations and acceptances

    Negative Logits
    Token            Logit
    " የወ"            0.70
    " Helm"          0.64
    " nontrivial"    0.64
    "iosos"          0.64
    "𝕦"              0.63
    " অভাব"          0.61
    "𝓌"              0.61
    "conse"          0.60
    "gnię"           0.59
                     0.59
    Positive Logits
    Token            Logit
    " okay"          4.51
    " ok"            4.28
    " OK"            4.26
    " Ok"            3.93
    "OK"             3.83
    "Ok"             3.81
    " Okay"          3.81
    " fine"          3.80
    "okay"           3.65
    " alright"       3.64
    Activation Density: 0.606%
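The page does not define the density figure. A common reading, assumed here, is the share of sampled token positions on which the feature's activation is above zero, so 0.606% would mean roughly 6 active positions per 1,000 tokens. The sketch below shows that arithmetic on made-up data; the function name `activation_density`, the `threshold` parameter, and the toy activations are all hypothetical and are not taken from the page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means active positions divided by total positions.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 1,000 token positions, 6 of which are active.
toy_activations = [0.0] * 994 + [1.2, 0.4, 3.1, 0.9, 2.2, 0.7]

density = activation_density(toy_activations)
print(f"{density:.3%}")  # prints 0.600%
```

Under this reading, a density well below 1% would be consistent with a sparse feature that fires only on a narrow set of tokens, such as the "okay"-style tokens listed under the positive logits.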

    No Known Activations