INDEX
    Explanations

    references to influential figures and their impacts on society

    Negative Logits
     iſt
    -1.07
     ſind
    -1.00
     itſelf
    -0.98
    -0.93
    ſelf
    -0.92
    numerusform
    -0.90
     ་་
    -0.90
    .",\n
    -0.88
     AppColors
    -0.87
    </caption>
    -0.87
    Positive Logits
     stuff
    0.92
     maybe
    0.81
     I
    0.80
     kinda
    0.77
     my
    0.76
     mierda
    0.73
     наверное
    0.73
     crappy
    0.73
    とか
    0.72
     &
    0.71
    Act Density 2.234%

    No Known Activations