INDEX
    Explanations

    describing extent or modification

    Negative Logits
    ulated    0.48
    							  0.39
              0.39
    irmed     0.38
    icate     0.38
     our      0.38
    |_        0.38
     вас      0.38
    <0x80>    0.37
    igi       0.37
    Positive Logits
     aggi               0.49
     aprovech           0.44
     aggiunto           0.44
     thêm               0.44
     supplémentaire     0.44
     fueron             0.44
     simplesmente       0.43
     fortuit            0.43
     extraneous         0.43
     aggiungere         0.42
    Act Density 0.032%

    No Known Activations