INDEX
    Explanations

    verbs that suggest observation or learning

    Negative Logits
    "ysl"       -0.17
    "irm"       -0.15
    "iar"       -0.15
    "ackers"    -0.15
    "ida"       -0.15
    "_basis"    -0.14
    "oir"       -0.14
    "AMY"       -0.14
    "us"        -0.14
    " tank"     -0.14

    Positive Logits
    "ton"       0.17
    "owell"     0.15
    "utow"      0.15
    "ittel"     0.15
    " ton"      0.15
    "enÄĽ"      0.14
    "igs"       0.14
    "ufs"       0.14
    "itzer"     0.14
    "nable"     0.14
    Act Density 0.000%

    No Known Activations