INDEX
    Explanations

    specific questions and punctuation marks to identify queries and various sentence structures

    Negative Logits
    " all"      -0.16
    "ellen"     -0.16
    "utton"     -0.15
    "ازی"       -0.15
    "jure"      -0.15
    "pora"      -0.15
    " none"     -0.14
    "巡"        -0.14
    "101"       -0.14
    "ani"       -0.14

    Positive Logits
    " Both"      0.28
    " BOTH"      0.28
    " обо"       0.27
    "Both"       0.25
    " both"      0.24
    "both"       0.23
    "_BOTH"      0.21
    " ambos"     0.21
    "_both"      0.21
    " beide"     0.21
    Act Density 0.526%

    No Known Activations