INDEX
    Explanations

    phrases related to direction or guidance

    assertions about directionality, particularly regarding positive or negative trajectories

    Negative Logits
    Token             Logit
    "Sec"             -0.62
    "ulas"            -0.61
    " Splash"         -0.60
    "çīĪ"             -0.60
    "Availability"    -0.58
    "arcity"          -0.58
    " fame"           -0.58
    "yrs"             -0.57
    " Instit"         -0.56
    " Spa"            -0.56

    Positive Logits
    Token             Logit
    " direction"      2.09
    " directions"     1.83
    " footsteps"      1.31
    "direction"       1.22
    " Direction"      1.18
    " opposite"       1.11
    "wards"           0.99
    " favor"          0.95
    "WARD"            0.93
    "ward"            0.92
    Activation Density: 0.148%

    No Known Activations