INDEX
    Explanations

    phrases related to movement or progress

    phrases indicating movement or progress towards a goal

    Negative Logits
    Token           Logit
    "uster"         -0.76
    "ropolitan"     -0.69
    "usters"        -0.69
    " livest"       -0.67
    "tein"          -0.62
    "itton"         -0.61
    "iasco"         -0.60
    " Tuc"          -0.60
    "icion"         -0.60
    "nect"          -0.59
    Positive Logits
    Token           Logit
    "fare"           1.20
    "ward"           0.95
    "finding"        0.88
    "WARD"           0.77
    " toward"        0.77
    "step"           0.76
    "steps"          0.75
    "finder"         0.73
    " towards"       0.72
    "seeing"         0.71
    Activation Density: 0.023%

    No Known Activations