    Negative Logits
    Token                 Logit
    " seen"               0.80
    " نقصان"              0.78
    " temporary"          0.76
    " خواهد"              0.75
    (no token shown)      0.74
    " 簡単"               0.74
    " undirected"         0.74
    " 사용"               0.72
    " compatible"         0.72
    " indicates"          0.72
    Positive Logits
    Token                 Logit
    "Oh"                  1.91
    " Oh"                 1.77
    " oh"                 1.76
    "Ah"                  1.55
    "oh"                  1.42
    "Honestly"            1.40
    " gosh"               1.37
    "Wow"                 1.35
    " OH"                 1.32
    "Absolutely"          1.31
    Activation Density: 0.904%

    No Known Activations