INDEX
    Explanations
    NEGATIVE LOGITS
     mirac          -0.07
    ِب              -0.07
    owl             -0.06
    کور             -0.06
    Mex             -0.06
     planets        -0.06
    مود             -0.06
     اف             -0.06
    _contin         -0.06
    fixtures        -0.06
    POSITIVE LOGITS
    ringe           0.08
     batch          0.08
     среди          0.07
    ","",           0.07
     dart           0.07
     hart           0.07
     sage           0.06
     Jelly          0.06
     immediately    0.06
     Stem           0.06
    Activation Density: 0.001%

    No Known Activations