INDEX
    Explanations

    words related to specific targets or goals

    instances of the word "target" and its related contexts

    Negative Logits
    Token            Logit
    "UGE"            -0.68
    "fo"             -0.68
    " Geological"    -0.67
    "IGH"            -0.67
    "ISTORY"         -0.67
    " notor"         -0.62
    "ERROR"          -0.62
    "plet"           -0.61
    "ansk"           -0.61
    "AX"             -0.60
    Positive Logits
    Token            Logit
    "ted"            1.13
    " targets"       0.97
    " target"        0.92
    "nels"           0.81
    "izen"           0.80
    "oided"          0.77
    "eers"           0.76
    "ishi"           0.75
    "target"         0.73
    "ataka"          0.72
    Activation Density: 0.016%

    No Known Activations