INDEX
    Explanations

    references to systems, orders, and general conditions in discussions

    Follows discourse markers or punctuation

    first-person explanation

    Negative Logits
    Token         Logit
    " my"         -1.13
    " minha"      -1.01
    "我的"        -0.98
    "my"          -0.97
    " mijn"       -0.96
    " meu"        -0.95
    " Mijn"       -0.94
    " minhas"     -0.92
    " meine"      -0.90
    " meus"       -0.89
    Positive Logits
    Token         Logit
    " we"         1.60
    " I"          1.50
    " We"         1.11
    "we"          1.09
    "We"          1.06
    " я"          0.95
    " мы"         0.84
    "I"           0.77
    " WE"         0.76
    "我就"        0.72
    Act Density 0.812%

    No Known Activations