INDEX
    Explanations

    words related to defiance or going against established norms or rules

    occurrences of the word "def," likely indicating definitions or actions related to defining something

    Negative Logits
    Token          Logit
    "DAY"          -0.72
    " Madness"     -0.71
    " Boll"        -0.70
    "sth"          -0.70
    " Hour"        -0.65
    " Elves"       -0.63
    " Archdemon"   -0.61
    "Reviewer"     -0.61
    " clip"        -0.60
    " livest"      -0.60

    Positive Logits
    Token          Logit
    "ensible"      1.32
    "erence"       1.24
    "ibr"          1.22
    "acement"      1.21
    "erent"        1.19
    "ected"        1.12
    "ocused"       1.12
    "aced"         1.09
    "ection"       1.09
    "amiliar"      1.08

    Activation Density: 0.015%

    No Known Activations