INDEX
    Explanations

    Words related to impacts or interactions, especially actions or events that have a strong effect or outcome.

    Negative Logits
    pires     -0.66
    ç«        -0.64
    href      -0.63
    Loading   -0.63
    ais       -0.62
    otype     -0.60
    atu       -0.60
    åŃ        -0.59
    agin      -0.59
    æĥ        -0.59
    Positive Logits
    ched       1.11
    achi       0.88
    boxes      0.87
    ches       0.84
    ting       0.81
    ted        0.81
    waves      0.79
     hardest   0.79
    pell       0.78
     puberty   0.76
    Activation Density: 2.362%
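
A minimal sketch of how a density figure like the one above could be computed, assuming it means the fraction of sampled token positions on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and do not come from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumption: "density" is the share of token positions where the
    feature fires at all; the dashboard may define it differently.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 3 of 8 token positions have a nonzero activation.
sample = [0.0, 0.0, 1.7, 0.0, 0.4, 0.0, 0.0, 2.1]
print(f"{activation_density(sample):.3%}")  # prints 37.500%
```

Under this reading, a density of 2.362% would mean the feature fires on roughly 2 or 3 of every 100 sampled tokens.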

    No Known Activations