    Explanations

    phrases related to attracting people or things

    the word "attract" and its variations in various contexts

    Negative Logits
    " Oops"         -0.66
    " procedure"    -0.64
    "hal"           -0.62
    " surviving"    -0.60
    " forearm"      -0.58
    " vault"        -0.57
    "jab"           -0.57
    "itcher"        -0.57
    "miah"          -0.56
    "chens"         -0.56
    Positive Logits
    " attention"    0.82
    "GGGGGGGG"      0.80
    " attract"      0.78
    "weights"       0.77
    "entious"       0.76
    "kefeller"      0.75
    " attracts"     0.74
    " attracted"    0.74
    "tails"         0.72
    "dinand"        0.71
    Activation Density: 0.036%

    No Known Activations