INDEX
    Explanations
    NEGATIVE LOGITS
     bordered        -0.09
    boot             -0.08
     cén             -0.08
    ding             -0.08
     preaching       -0.08
     الإسلامية       -0.08
    provider         -0.08
    slides           -0.07
    bindings         -0.07
    tips             -0.07
    POSITIVE LOGITS
     viande           0.08
     daughter         0.08
     Processes        0.08
     chlor            0.08
     endl             0.07
     Gesetz           0.07
     clá              0.07
     hrane            0.07
    _WAIT             0.07
     kinetics         0.07
    Activation Density: 0.002%

    No Known Activations