INDEX
    Explanations

    numeric values followed by currency symbols or measurements

    Negative Logits
    Token        Logit
     itſelf      -0.92
     ſind        -0.88
    "';          -0.86
     Efq         -0.83
    ]),\n        -0.83
     auffi       -0.82
     ་་          -0.82
     iſt         -0.82
     Asimismo    -0.81
    }';          -0.80
    Positive Logits
    Token        Logit
     +           0.67
    +            0.67
     I           0.65
     thing       0.65
     or          0.64
     crappy      0.64
     %           0.62
     @           0.61
     whatever    0.61
     &           0.60
    Act Density 0.257%
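
As a rough illustration of how a figure like the activation density above could be computed, the sketch below assumes density means the fraction of token positions on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical and not taken from the dashboard; the page does not state how its own value was produced.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" is taken to mean the share of tokens on which the
    feature fires at all; the dashboard does not document its exact definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 1000 token positions, 3 of them with a non-zero activation.
toy_activations = [0.0] * 997 + [1.2, 0.4, 3.1]
density = activation_density(toy_activations)

# Format as a percentage with three decimals, matching the style of the figure above.
print(f"Act Density {density:.3%}")
```

Under this reading, a density of 0.257% would correspond to the feature firing on roughly 1 in 390 tokens of whatever corpus was sampled.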

    No Known Activations