INDEX
    Explanations

    Code variables

    Negative Logits
    Token          Logit
     fairness      -0.07
     prevailed     -0.07
    _pieces        -0.06
     ERC           -0.06
    <script        -0.06
     Lie           -0.06
     cuisine       -0.06
    Ch             -0.06
    endoza         -0.06
     oppressed     -0.06

    Positive Logits
    Token          Logit
     custom        0.07
    zim            0.07
    805            0.06
     freshmen      0.06
    iatric         0.06
     thumbs        0.06
     کیل           0.06
    +"]            0.06
    answer         0.06
    มนตร           0.05

    Activation Density 0.047%

    No Known Activations