INDEX
    Explanations

    character descriptions and development

    Negative Logits
    Token                   Logit
    "ح"                     1.53
    "การ"                   1.38
    "س"                     1.38
    "ك"                     1.35
    " are"                  1.27
    (token not rendered)    1.23
    (token not rendered)    1.21
    (token not rendered)    1.20
    "ف"                     1.18
    "いち"                  1.14
    Positive Logits
    Token                   Logit
    "al"                    1.31
    "ě"                     1.27
    "d"                     1.06
    "el"                    1.03
    "az"                    1.02
    " Character"            1.01
    "st"                    0.98
    " "                     0.98
    " characters"           0.95
    "il"                    0.93
    Activation Density: 0.071%

    No Known Activations