INDEX
    Explanations

    nouns following possessives

    Negative Logits
    "ंबे"  0.39
    ":</"  0.37
    " দেখুনঃ"  0.36
    " intricacies"  0.34
    ")・"  0.34
    " ядра"  0.33
    "ელი"  0.33
    [token missing]  0.33
    "الب"  0.33
    "ینده"  0.33
    Positive Logits
    " persönliche"  0.45
    " has"  0.44
    " son"  0.41
    " persön"  0.40
    " personal"  0.39
    "takes"  0.39
    " persoonlijke"  0.39
    "个人的"  0.39
    "dad"  0.38
    " має"  0.38
    Activation Density: 0.020%

    No Known Activations