INDEX
    Explanations

    introducing today's topic

    Negative Logits
    " každ"          0.39
    " každý"         0.38
    (not rendered)   0.38
    " każdy"         0.37
    "INCLUDE"        0.36
    "्यारह"           0.36
    (not rendered)   0.35
    "丁寧"            0.35
    "াহীন"            0.35
    "ዎቹ"             0.35
    Positive Logits
    " Topic"         0.57
    " topic"         0.56
    " theme"         0.55
    " tenemos"       0.54
    "今天要"          0.54
    "Topic"          0.53
    " موضوع"         0.52
    " Thema"         0.52
    " Theme"         0.52
    " tratto"        0.50
    Activation Density: 0.028%

    No Known Activations