INDEX
    NEGATIVE LOGITS
    Token               Value
    "،"                 0.69
    (no token shown)    0.61
    ",《"                0.50
    "idać"              0.47
    "avez"              0.44
    ";"                 0.44
    (no token shown)    0.43
    " ۽"                0.42
    " Auswirkungen"     0.42
    (no token shown)    0.41
    POSITIVE LOGITS
    Token               Value
    " className"        0.78
    " class"            0.68
    " measured"         0.52
    " classe"           0.52
    " belong"           0.52
    " classed"          0.50
    " belonging"        0.49
    " belongs"          0.49
    " belonged"         0.47
    " clase"            0.47
    Activation Density: 0.011%

    No Known Activations