    Negative Logits
    Token                                      Logit
    "Życiorys"                                 -0.52
    "ibus"                                     -0.50
    " schools"                                 -0.50
    "єра"                                      -0.50
    "MessageTagHelper"                         -0.49
    "__':" (followed by a line break)          -0.48
    " Teach"                                   -0.48
    "ieties"                                   -0.47
    " Negoti"                                  -0.47
    "errorHandler"                             -0.47
    Positive Logits
    Token                                      Logit
    (blank)                                    0.63
    " autorytatywna"                           0.61
    " drive"                                   0.57
    "drive"                                    0.56
    "Espèce"                                   0.55
    "Captor"                                   0.54
    "стоин"                                    0.54
    " bonté"                                   0.53
    "ArgsConstructor"                          0.52
    " vettor"                                  0.51
    Activation Density: 0.002%

    No Known Activations