    Explanations

    targets or entities that are being identified or aimed at

    references to specific objectives or targets

    Negative Logits
    "OVA"           -0.71
    " haz"          -0.68
    "pole"          -0.66
    "cup"           -0.65
    "john"          -0.65
    " htt"          -0.65
    "alus"          -0.63
    "uitive"        -0.62
    "inct"          -0.62
    "inus"          -0.62

    Positive Logits
    " targets"       4.09
    " target"        2.72
    "target"         2.09
    "Target"         1.83
    " targeted"      1.82
    " Targ"          1.79
    " targeting"     1.74
    " Target"        1.70
    " targ"          1.66
    " objectives"    1.61
    Activation Density: 0.009%

    No Known Activations