INDEX
    Explanations

    relationships and dependencies within sentences

    Negative Logits
    (whitespace token)    -0.15
    " çünkü"              -0.14
    "æĥij"                 -0.13
    "Îŀ"                  -0.13
    "ngen"                -0.13
    "ilyn"                -0.13
    "iki"                 -0.13
    "orgen"               -0.13
    ";"                   -0.13
    "ns"                  -0.13

    Positive Logits
    " же"                 0.20
    " his"                0.18
    "ï¼Įä»ĸ"              0.17
    " ìĿ´ëĬĶ"             0.17
    " {},"                0.16
    " _______,"           0.16
    " [],"                0.16
    " ìĿ´ë٬íķľ"           0.16
    "/her"                0.15
    " their"              0.15

    Act Density 0.508%

    No Known Activations