INDEX
    Explanations
    NEGATIVE LOGITS
    "elo"         -0.52
    "Kna"         -0.51
    "angular"     -0.47
    "zo"          -0.47
    "cur"         -0.47
    " Центра"     -0.46
    "tat"         -0.46
    "leon"        -0.45
    "FullName"    -0.45
    "XYZ"         -0.45
    POSITIVE LOGITS
    " if"         1.02
    " If"         0.95
    "If"          0.92
    "Wenn"        0.91
    "if"          0.91
    " eğer"       0.90
    " Wenn"       0.89
    " gdyby"      0.88
    " jika"       0.85
    " Jika"       0.85
    Activation Density: 0.258%

    No Known Activations