INDEX
    Explanations

    references to personal relationships and interactions

    Negative Logits
     ÙħÙĪØ§Ø·    -0.16
    ukt          -0.16
    zon          -0.14
    998          -0.14
    itas         -0.14
    okoj         -0.14
    742          -0.14
     Muj         -0.14
    347          -0.14
    539          -0.14

    Positive Logits
    ilden        0.17
    .wr          0.16
    apel         0.15
    at           0.15
    hee          0.14
     Matte       0.14
     dikke       0.14
    ilecek       0.14
    ichel        0.14
     det         0.14
    Activation Density: 0.002%

    No Known Activations