INDEX
    Explanations

    Question and answering

    Negative Logits
     τελ          -0.07
    Formatting    -0.07
     گرف          -0.07
    .break        -0.06
     safest       -0.06
    にお          -0.06
     getText      -0.06
     activism     -0.06
    orphic        -0.06
     story        -0.06

    Positive Logits
    ğu            0.07
    proto         0.07
     sürekli      0.07
     WAS          0.06
     traditional  0.06
                  0.06
     şans         0.06
    PGA           0.06
     rund         0.06
    ".↵↵          0.06
    Activation Density: 0.002%

    No Known Activations