INDEX
    Explanations

    the word "fun" or similar variations indicating enjoyment or pleasure

    references to enjoyment or amusement

    Negative Logits
    " Hoover"      -0.69
    " Gork"        -0.68
    " underest"    -0.68
    " Hawth"       -0.66
    " Hein"        -0.64
    " Canter"      -0.64
    " fright"      -0.62
    " Horton"      -0.61
    " crush"       -0.60
    " ib"          -0.60
    Positive Logits
    "ctions"       1.68
    "func"         1.11
    "fun"          1.07
    "ancial"       1.05
    "eral"         1.00
    "rontal"       0.92
    "cture"        0.90
    "ctory"        0.90
    "enges"        0.88
    "aunder"       0.87
    Activation Density: 0.014%

    No Known Activations