INDEX
    Explanations

    expressions of strong affection or admiration

    Negative Logits
    ̧            -0.17
    awy         -0.15
    rech        -0.15
     karak      -0.15
    canf        -0.15
    @student    -0.14
    acie        -0.14
    immel       -0.14
    solete      -0.14
    regist      -0.13
    Positive Logits
    idge        0.17
    Bounding    0.17
    itty        0.17
    ahy         0.15
    eza         0.15
    Traits      0.14
    arus        0.14
    'gc         0.14
    IPA         0.14
    ataka       0.14
    Activation Density 0.023%

    No Known Activations