INDEX
    Explanations

    references to citations and authors in academic contexts

    Negative Logits
     İnsan
    -0.20
     Düny
    -0.17
    Í
    -0.17
     Bölüm
    -0.16
     Bazı
    -0.16
     Müş
    -0.16
     áº
    -0.16
     Onun
    -0.15
     Özellikle
    -0.15
     İst
    -0.15
    Positive Logits
     Oz
    0.24
     Erd
    0.23
    ̧
    0.22
     Dog
    0.22
     Kurt
    0.21
     Alt
    0.21
     Nec
    0.20
     Ser
    0.20
    erdem
    0.20
    orman
    0.20
    Act Density 0.028%

    No Known Activations