    NEGATIVE LOGITS
    "_topic"         -0.07
    "Han"            -0.07
    "رج"             -0.07
    "ectors"         -0.07
    " oma"           -0.07
    " sufferers"     -0.06
    " punishable"    -0.06
    "$x"             -0.06
    " energies"      -0.06
    " ла"            -0.06
    POSITIVE LOGITS
    " Trusted"       0.06
    "blo"            0.06
    [token not recovered]  0.06
    "Used"           0.06
    " IBOutlet"      0.06
    " love"          0.06
    "cripts"         0.06
    "books"          0.06
    "installed"      0.05
    " Elevated"      0.05
    Activation density: 0.332%

    No Known Activations