INDEX
    Explanations

    phrases that indicate enjoyment or fun

    instances of the word "fun" in various contexts

    Negative Logits
    " bottleneck"      -0.72
    " Centauri"        -0.69
    "BOOK"             -0.63
    "attle"            -0.62
    " underest"        -0.61
    " proport"         -0.61
    "obook"            -0.59
    " Transparency"    -0.59
    " helicop"         -0.59
    " defective"       -0.59
    Positive Logits
    "nels"       1.57
    "gal"        1.14
    "eral"       1.09
    "nell"       1.09
    "nel"        1.02
    "ctor"       0.93
    "ctory"      0.86
    "ilee"       0.86
    "ctions"     0.83
    "imation"    0.83
    Activation Density: 0.025%

    No Known Activations