INDEX
    Explanations

    phrases related to directions or trajectories

    concepts related to direction or guidance

    Negative Logits
    enty        -0.75
    roma        -0.73
    athered     -0.73
    nikov       -0.72
    esters      -0.68
    reditary    -0.68
    ammy        -0.67
    bley        -0.67
    akov        -0.66
    unker       -0.66
    Positive Logits
     direction   1.18
    ality        1.17
     directions  1.11
     towards     0.99
     toward      0.99
    finder       0.94
    finding      0.92
    ally         0.87
    posts        0.84
    eering       0.84
    Act Density 0.046%

    No Known Activations