    Negative Logits
    Token       Logit
    Ramen       -0.75
     Friedman   -0.73
    ("</        -0.71
    ığınız      -0.71
    getPage     -0.71
    preise      -0.70
    شور         -0.69
     škole      -0.69
    Agua        -0.69
     learned    -0.68
    Positive Logits
    Token       Logit
     ORA        0.81
     Atalanta   0.79
     Pyro       0.78
     blackjack  0.76
     PREF       0.72
     Ат         0.70
     Atkins     0.70
     Trophy     0.69
     Sampler    0.69
     fum        0.69
    Act Density: 0.011%

    No Known Activations