INDEX
    NEGATIVE LOGITS
    .emf            -0.07
    ويك             -0.07
    Trou            -0.07
    iction          -0.06
    ect             -0.06
    Transition      -0.06
    .met            -0.06
    _MET            -0.06
     stra           -0.06
    owntown         -0.06
    POSITIVE LOGITS
     Blackburn      0.07
    .confirm        0.06
    _foreign        0.06
    іж              0.06
    Channels        0.06
     convincing     0.06
    (dirname        0.06
     πολύ           0.06
     awareness      0.06
    Cluster         0.06
    Act Density 0.025%

    No Known Activations