INDEX
    Explanations

    single words and phrases related to rebellion or opposition

    references to "counter" concepts or movements

    Negative Logits
    Token             Logit
    "OOD"             -0.67
    " goodbye"        -0.64
    " Forge"          -0.63
    " Franks"         -0.61
    " Finch"          -0.61
    " Slime"          -0.60
    " Tornado"        -0.60
    " Bram"           -0.60
    "icity"           -0.59
    " Bib"            -0.58
    Positive Logits
    Token             Logit
    "measures"        1.47
    "balance"         1.38
    "intuitive"       1.38
    "attack"          1.31
    "fact"            1.25
    "clock"           1.24
    "culture"         1.23
    "offensive"       1.22
    "intelligence"    1.22
    "cultural"        1.21
    Activation Density: 0.019%

    No Known Activations