Index
    Explanations
    Negative Logits
    132
    -0.07
     hier
    -0.07
    Elevation
    -0.07
    helu
    -0.07
    pho
    -0.07
    utherford
    -0.07
     Rutherford
    -0.07
     cập
    -0.07
    Amplitude
    -0.07
     edific
    -0.07
    Positive Logits
    0.08
     poured
    0.08
     MRT
    0.08
    dac
    0.07
     melted
    0.07
     الزيت
    0.07
    -drop
    0.07
     mula
    0.07
    gc
    0.07
     tern
    0.07
    Act Density 0.001%

    No Known Activations