INDEX
    Explanations
    Negative Logits
    Token    Value
    "0"      0.76
    "5"      0.69
    "6"      0.66
    "AP"     0.64
    "OP"     0.62
    "4"      0.61
    "9"      0.61
    "ER"     0.61
    "۰"      0.61
    "ET"     0.59
    Positive Logits
    Token      Value
    "parts"    1.00
    " parts"   0.97
    "teile"    0.86
    "部件"     0.84
    "ীদার"     0.83
    "Parts"    0.80
    " Parts"   0.80
    "零件"     0.80
    " 부분이"  0.78
    " Teile"   0.76
    Act Density 0.105%

    No Known Activations