    Negative Logits
    Token         Logit
    "++"          -0.62
    "colar"       -0.60
    " Axiom"      -0.57
    " '',"        -0.56
    " DED"        -0.56
    "'>"          -0.56
    "$',"         -0.54
    " Acorn"      -0.54
    "'},"         -0.54
    " idea"       -0.54

    Positive Logits
    Token                 Logit
    " respectively"       4.06
    "respectively"        3.64
    " respectivamente"    3.29
    " respectivement"     2.84
    " respective"         2.23
    " соответственно"     2.00
    " rispet"             1.85
    "respective"          1.70
    " respec"             1.68
    "分别"                1.63

    Activation Density: 0.162%

    No Known Activations