INDEX
    Explanations
    Negative Logits
        " 不过"      0.44
        " ذریع"      0.43
        "িনবার্গ"      0.42
        " pięk"      0.42
        " คง"        0.41
        " cómod"     0.40
        " conç"      0.40
        " puisqu"    0.40
        "ђено"       0.39
        " conçu"     0.38
    Positive Logits
        " there"     1.11
        "there"      0.94
        " someone"   0.90
        " it"        0.89
                     0.79
        " 해당"      0.78
        " somebody"  0.77
        " the"       0.75
        "someone"    0.73
        " هناك"      0.72
    Activation Density: 0.010%

    No Known Activations