INDEX
    Explanations

    phrases related to direction or movement

    Negative Logits
    Token             Logit
    "ollar"           -0.62
    "manship"         -0.60
    "pots"            -0.60
    " lett"           -0.58
    "iqueness"        -0.58
    "ウス"            -0.57
    "pot"             -0.56
    " situational"    -0.56
    "Ability"         -0.56
    " Actual"         -0.56
    Positive Logits
    Token             Logit
    " towards"        0.99
    " toward"         0.98
    " stairs"         0.98
    " downhill"       0.95
    "Ô"               0.94
    " unnoticed"      0.94
    " corridors"      0.88
    "wards"           0.86
    " blindly"        0.85
    " unch"           0.85
    Activation Density: 2.739%

    No Known Activations