    Explanations

    variables followed by special characters

    Negative Logits
    "ي"          0.61
    (not shown)  0.60
    "י"          0.57
    "ح"          0.55
    "ığı"        0.55
    "ushchev"    0.55
    "ώ"          0.55
    "ται"        0.54
    "ılar"       0.54
    "ά"          0.53
    Positive Logits
    " $"         0.86
    " as"        0.80
    "k"          0.71
    "IL"         0.66
    " Полу"      0.66
    " on"        0.62
    "ik"         0.62
    ".$"         0.61
    "'$"         0.61
    (not shown)  0.61
    Act Density 0.022%

    No Known Activations