INDEX
    NEGATIVE LOGITS
    Token            Logit
    "iface"          -0.07
    "}↵↵"            -0.07
    "tridge"         -0.07
    "}),↵"           -0.07
    "_sp"            -0.07
    "TRUE"           -0.07
    ".complete"      -0.07
    " ‎#"             -0.06
    ".csv"           -0.06
    "/topics"        -0.06

    POSITIVE LOGITS
    Token            Logit
    " revelation"    0.07
    " Ava"           0.07
    " legality"      0.06
    " Bened"         0.06
    "ukarı"          0.06
    "	child"         0.06
    "ротив"          0.06
    " تل"            0.06
    " Katie"         0.05
    " Davies"        0.05
    Activation Density: 0.005%

    No Known Activations