    NEGATIVE LOGITS
    Token (quoted to show leading spaces)    Logit
    "ntö"                                    -0.40
    " Catto"                                 -0.35
    " zá"                                    -0.35
    "Parameteri"                             -0.35
    " Zus"                                   -0.35
    " veiks"                                 -0.35
    "хьтан"                                  -0.35
    " uden"                                  -0.35
    " Vikipedi"                              -0.34
    " '\\;'"                                 -0.34
    POSITIVE LOGITS
    Token (quoted to show leading spaces)    Logit
    " blow"                                  1.34
    "blow"                                   1.31
    " blew"                                  1.27
    "Blow"                                   1.27
    " blown"                                 1.26
    " Blow"                                  1.25
    " blowing"                               1.20
    "blowing"                                1.10
    "blown"                                  1.08
    " blows"                                 1.05
    Activation Density: 0.010%

    No Known Activations