INDEX
    Explanations

    phrases indicating direction or intention

    Negative Logits
    Token           Logit
    " kasarigan"    -0.64
    " MIC"          -0.59
    "riscoll"       -0.57
    " Vari"         -0.56
    " Sole"         -0.55
    "vaux"          -0.55
    "Біографія"     -0.54
    "Static"        -0.52
    " Galle"        -0.51
    " Inter"        -0.51

    Positive Logits
    Token           Logit
    " towards"      4.20
    " toward"       4.12
    "towards"       4.01
    "toward"        3.90
    " Towards"      3.79
    " Toward"       3.70
    "Towards"       3.63
    "Toward"        3.29
    " hacia"        2.73
    " envers"       2.20

    Activation Density: 0.046%

    No Known Activations