    Explanations

    words related to intense actions or states of being

    Negative Logits
    abor      -0.17
    arium     -0.17
    aman      -0.16
    fu        -0.16
    onen      -0.16
    abil      -0.16
    ensi      -0.15
    fa        -0.15
    aki       -0.15
    él        -0.15
    Positive Logits
    ught      0.26
    UGHT      0.23
    ughter    0.22
    INT       0.22
    int       0.21
    unch      0.21
    ints      0.21
    ìŀħ       0.20
    unting    0.20
    Ñĥнд      0.20
    Activation Density: 0.051%

    No Known Activations