    Negative Logits
     transaction    0.42
     axles          0.42
     elevators      0.40
     disks          0.40
     drugstore      0.39
     तेंदुलकर        0.38
    ättre           0.38
     crowding       0.38
    (blank)         0.38
    mathrm          0.38
    Positive Logits
    i               0.50
     терпе          0.47
    ness            0.44
    Science         0.43
    spring          0.43
    Soy             0.42
    Baik            0.42
    Malay           0.42
     Yuki           0.41
    Prot            0.41
    Activation Density: 0.001%

    No Known Activations