INDEX
    Explanations

    phrases indicating choices or decision-making situations

    Negative Logits
    Token         Logit
    " even"       -0.16
    "ettel"       -0.15
    " Mund"       -0.15
    "hd"          -0.15
    " Even"       -0.15
    "ito"         -0.14
    " reverse"    -0.14
    " Tanner"     -0.14
    "arris"       -0.14
    " Gulf"       -0.14
    Positive Logits
    Token         Logit
    " alike"      0.23
    " đều"        0.18
    "Regardless"  0.18
    "uniform"     0.15
    " always"     0.15
    " hepsi"      0.15
    "본"          0.15
    "akov"        0.15
    "ῶ"           0.15
    " uniformly"  0.15
    Activation Density: 0.108%

    No Known Activations