    Negative Logits
    <unused1023>    0.55
     од    0.53
     پن    0.53
    OUIS    0.51
     عر    0.51
     испыта    0.50
        0.50
    <unused968>    0.48
     لوبوي    0.48
    <unused506>    0.48
    Positive Logits
    3    0.53
    1    0.51
    я    0.50
     system    0.46
     net    0.45
    mes    0.44
    0    0.44
     switch    0.44
     response    0.43
     media    0.43
    Act Density 0.000%

    No Known Activations