INDEX
    Explanations

    references to specific years and historical milestones

    Negative Logits
    Token               Logit
    "lisi"              -0.17
    "並"                -0.14
    "rei"               -0.14
    "raud"              -0.14
    "aser"              -0.14
    "aru"               -0.14
    " preferredStyle"   -0.14
    "ruba"              -0.13
    "atives"            -0.13
    "utral"             -0.13
    Positive Logits
    Token               Logit
    " when"             0.23
    "when"              0.21
    " cuando"           0.16
    " khi"              0.16
    " When"             0.15
    " عندما"            0.15
    "яд"                0.15
    "σε"                0.15
    " حين"              0.15
    " quando"           0.15
    Activation Density: 0.069%

    No Known Activations