INDEX
    Explanations

    electrical charge properties

    Negative Logits
    ید
    1.13
    ۔
    1.05
    ಬ್ಬಿಣ
    1.03
    ت
    1.03
    מ
    1.02
    もら
    1.00
    م
    0.98
    ಿವೆ
    0.97
    یک
    0.95
     अभिने
    0.95
    Positive Logits
     that
    1.27
    that
    1.17
    ation
    0.93
    on
    0.91
    я
    0.87
    ,
    0.87
    That
    0.82
    0.81
    :
    0.75
    ak
    0.75
    Activation Density: 0.003%

    No Known Activations