INDEX
    Explanations

    phrases indicating direction or movement towards a goal or endpoint

    Negative Logits
    Token          Logit
    " Calvo"       -0.84
    " foglal"      -0.72
    " fatica"      -0.68
    " poffe"       -0.67
    (not rendered) -0.67
    "5"            -0.67
    "T"            -0.65
    " Bede"        -0.63
    "ыре"          -0.62
    " riuscito"    -0.62

    Positive Logits
    Token          Logit
    "toward"       1.81
    "towards"      1.77
    " Towards"     1.75
    " Toward"      1.75
    " toward"      1.74
    "Towards"      1.68
    " towards"     1.63
    "Toward"       1.61
    " hacia"       1.25
    " envers"      1.19
    Act Density 0.060%
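
The density figure above can be read as a simple ratio. The sketch below assumes that "Act Density" means the fraction of sampled token positions on which the feature's activation is above zero; the page does not state this definition, and the function name `activation_density`, the threshold, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: density = (active positions) / (all sampled positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 6 of 10,000 token positions.
sample = [0.0] * 9994 + [1.3] * 6
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.060%"
```

Under this reading, a density of 0.060% corresponds to roughly 6 active positions per 10,000 sampled tokens.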

    No Known Activations