INDEX
    Explanations

    the word "take" and its variations in various contexts

    Negative Logits
    rowse      -0.18
    uen        -0.16
    ears       -0.15
    armor      -0.15
    simd       -0.14
    ouser      -0.14
    pend       -0.14
    ç          -0.14
     пÑĢез     -0.14
    plr        -0.14
    Positive Logits
    ibold      0.17
    otos       0.15
    atom       0.15
    idl        0.15
     Zwe       0.14
    구         0.14
    ffiti      0.14
    .Factory   0.13
     paso      0.13
    bk         0.13
    Act Density 0.067%

    No Known Activations