INDEX
    Explanations

    higher frequency action verbs and terms indicating change or progression

    Negative Logits
    azo
    -0.17
    etric
    -0.14
    ymm
    -0.14
     cref
    -0.13
    adelphia
    -0.13
     Lightweight
    -0.13
    aldi
    -0.13
    кур
    -0.13
    iero
    -0.13
    ARRANT
    -0.13
    Positive Logits
    一下
    0.23
    uling
    0.17
    ometimes
    0.15
    .son
    0.15
     пункт
    0.15
    ink
    0.14
    ing
    0.14
    asis
    0.14
    uate
    0.14
    sett
    0.14
    Act Density 0.006%

    No Known Activations