    Negative Logits
    Token           Logit
    " srf"          -0.81
    "emaker"        -0.76
    "Þ"             -0.73
    "İĭ"            -0.72
    "ailability"    -0.68
    " rall"         -0.67
    "ModLoader"     -0.66
    " skelet"       -0.66
    "¥ŀ"            -0.66
    " obser"        -0.65
    Positive Logits
    Token           Logit
    " Fuck"         0.97
    " Nobody"       0.91
    " Nothing"      0.86
    " Anyway"       0.85
    " Aren"         0.82
    " Everybody"    0.82
    " Actually"     0.81
    "fuck"          0.81
    " Whenever"     0.81
    " Nope"         0.80
    Activation Density: 0.027%

    No Known Activations