INDEX
    Explanations

    negative or critical adjectives

    negative descriptors and critique-related terms

    Negative Logits
    Downloadha      -0.82
    iler            -0.78
    ologue          -0.73
    udeb            -0.70
    rongh           -0.69
    »Ĵ              -0.69
    opez            -0.68
    roma            -0.68
    hover           -0.67
    onde            -0.67
    Positive Logits
     enough         1.14
     territory      0.84
     insanity       0.84
     storytelling   0.83
     folly          0.82
     nonsense       0.82
    speak           0.81
     coincidence    0.81
     fodder         0.78
     stupidity      0.78
    Activation Density 0.359%

    No Known Activations