    Explanations

    words indicating direction or movement towards something

    Negative Logits
    Token          Logit
    " towards"     -0.85
    "towards"      -0.72
    " Towards"     -0.72
    "Towards"      -0.72
    " toward"      -0.66
    "Toward"       -0.59
    " Toward"      -0.59
    "toward"       -0.56
    " contre"      -0.41
    " contra"      -0.41
    Positive Logits
    Token             Logit
    "guenos"          0.51
    " zijne"          0.51
    "haikusbot"       0.48
    " sirva"          0.47
    " transfieras"    0.47
    " pinulongan"     0.47
    " Gewähr"         0.46
    " pouvoit"        0.45
    " damskie"        0.45
    " nôtre"          0.45
    Activation Density: 0.006%

    No Known Activations