    NEGATIVE LOGITS
    Token          Logit
    "IZATION"      0.38
    " говорю"      0.37
    "skaya"        0.35
    "Xamarin"      0.35
    "りました"     0.34
    "透過"         0.34
    "];"           0.34
    "ALT"          0.34
    "SARS"         0.34
    "//!"          0.33
    POSITIVE LOGITS
    Token            Logit
    " advantage"     1.15
    " care"          0.88
    "advantage"      0.79
    " precautions"   0.71
    " liberties"     0.70
    " heed"          0.69
    " ventaja"       0.64
    " turns"         0.64
    " aback"         0.64
    " charge"        0.63
    Activation Density: 0.034%

    No Known Activations