    Explanations

    pronouns and character names

    Negative Logits
    Token                   Value
    " WITH"                 0.52
    " ihrer"                0.51
    " THEIR"                0.48
    " their"                0.47
    "他们的"                0.47
    " themselves"           0.46
    (token text missing)    0.46
    "他們"                  0.45
    " Их"                   0.45
    " त्यांच्या"              0.44

    Positive Logits
    Token                   Value
    "fifty"                 0.47
    "akke"                  0.45
    " няколко"              0.42
    " fifty"                0.40
    "midt"                  0.40
    "zelf"                  0.38
    " два"                  0.38
    "fy"                    0.37
    (token text missing)    0.37
    " რამდენ"               0.37
    Activation Density: 0.012%

    No Known Activations