    NEGATIVE LOGITS
    Token         Logit
    " douceur"    -0.88
    "込んで"      -0.77
    " devoir"     -0.75
    "トイレ"      -0.72
    "込んだ"      -0.71
    " شهرستان"    -0.71
    "َي"          -0.71
    " davvero"    -0.69
    "טרה"         -0.68
    "siery"       -0.68
    POSITIVE LOGITS
    Token         Logit
    " state"      5.13
    " states"     4.03
    " State"      3.94
    "state"       3.75
    "State"       3.47
    "STATE"       3.23
    " estado"     3.22
    " STATE"      3.16
    "状态"        3.13
    "states"      3.03
    Activation Density: 0.055%

    No Known Activations