INDEX
    Explanations

    phrases suggesting anticipation of future information or events

    references to staying informed or updated

    Negative Logits
    エル          -0.81
    lain        -0.70
    perse       -0.70
     Kod        -0.66
    adr         -0.66
    roma        -0.64
     Paint      -0.64
    pha         -0.63
     Combine    -0.62
     Lak        -0.61
    Positive Logits
     tuned      1.27
     tuning     1.21
     tune       1.06
     Tune       0.96
     tun        0.86
     horns      0.82
     Tun        0.82
    eness       0.81
    edo         0.80
    tun         0.79
    Activation Density: 0.010%

    No Known Activations