INDEX
    Explanations

    "personal" before categories

    Negative Logits
    "provide"          0.48
    "Па"               0.44
    "a"                0.41
    "خ"                0.40
    "Бе"               0.39
                       0.39
    "Су"               0.39
    "ה"                0.39
    "Ж"                0.38
    " bless"           0.38
    Positive Logits
    " personal"        1.16
    " Personal"        1.02
    "personal"         0.98
    "Personal"         0.96
    " PERSONAL"        0.93
    " persoonlijke"    0.90
    " 개인"            0.89
    " पर्सनल"          0.85
    " προσωπ"          0.84
    " व्यक्तिगत"        0.83
    Activation Density: 0.031%

    No Known Activations