INDEX
    Explanations

    terms indicating negation or non-existence

    Negative Logits
    Token        Logit
    " Tycoon"    -0.98
    " Franks"    -0.76
    "ゼウス"     -0.68
    " Grind"     -0.65
    " hordes"    -0.65
    " Rooms"     -0.64
    " Dug"       -0.64
    " Halls"     -0.64
    "Kings"      -0.64
    " Spoon"     -0.62
    Positive Logits
    Token            Logit
    "chal"           1.20
    "stop"           1.10
    "linear"         1.07
    "etheless"       1.07
    "verbal"         1.05
    "epad"           1.04
    "fiction"        1.02
    "threatening"    1.02
    "profit"         1.01
    "zero"           1.00
    Activation Density: 0.018%

    No Known Activations