    NEGATIVE LOGITS
    "iasi"          -0.18
    " bulk"         -0.17
    "lessly"        -0.16
    "155"           -0.15
    "shan"          -0.15
    " Trojan"       -0.15
    "erval"         -0.15
    " mini"         -0.14
    " Benchmark"    -0.14
    "imb"           -0.14
    POSITIVE LOGITS
    ".GetObject"    0.15
    "opak"          0.15
    "foy"           0.14
    "ynos"          0.14
    "ERGY"          0.14
    "fir"           0.14
    "ephy"          0.14
    "宿"            0.14
    ".cgi"          0.14
    " Ginny"        0.13
    Activation Density: 0.055%

    No Known Activations