INDEX
    Explanations

    conditional or contrasting phrases

    Negative Logits
    aurus     -0.16
    isz       -0.16
    hip       -0.14
    latin     -0.14
     Ñĥмов    -0.14
     Socorro  -0.14
    æĽ        -0.13
    endi      -0.13
    erate     -0.13
    iate      -0.13

    Positive Logits
     Hod       0.17
    uga        0.16
     Pon       0.15
     ALSO      0.15
    ãĥ¼ãĥª     0.15
    fen        0.14
    acen       0.14
    йн         0.14
     also      0.14
    дам        0.14

    Act Density 0.215%

    No Known Activations