INDEX
    Explanations

    phrases describing changes in numbers or values

    Negative Logits
    "sonian"       -0.67
    "ardi"         -0.63
    "uminati"      -0.60
    "sama"         -0.60
    "ologies"      -0.59
    "ician"        -0.58
    " Pastebin"    -0.58
    "exe"          -0.58
    "Ide"          -0.56
    "_-"           -0.56
    Positive Logits
    " downhill"    1.14
    " up"          1.06
    " DOWN"        0.97
    " backwards"   0.97
    " downwards"   0.94
    " down"        0.94
    " upwards"     0.93
    " sideways"    0.90
    " upward"      0.88
    " steadily"    0.86
    Activation Density: 0.046%

    No Known Activations