INDEX
    Explanations

    before/after specific verbs

    Negative Logits
    Token               Value
    "ор"                1.03
    " appare"           1.03
    " dems"             1.00
    " immuno"           0.97
    " aliqu"            0.95
    "یر"                0.94
    " état"             0.93
                        0.93
    "️⃣"                 0.91
    " existent"         0.90
    Positive Logits
    Token               Value
    "i"                 0.95
    "ి"                 0.85
    "<code>"            0.83
    " "                 0.82
    " nejen"            0.78
    "ACKNOWLEDGMENTS"   0.76
    "ENT"               0.74
    "PG"                0.72
    " serta"            0.71
    "UM"                0.71
    Act Density 0.000%

    No Known Activations