    Negative Logits
    Token                  Logit
    " AK"                  -0.08
    (token not shown)      -0.08
    "ak"                   -0.07
    " internazionale"      -0.07
    "AK"                   -0.07
    " Trigger"             -0.07
    (token not shown)      -0.07
    (token not shown)      -0.07
    "_PR"                  -0.07
    " Petrol"              -0.07
    Positive Logits
    Token                  Logit
    " bhf"                 0.09
    " Slayer"              0.09
    " білім"               0.09
    "vendors"              0.08
    " minced"              0.08
    " opleiding"           0.08
    " бид"                 0.08
    "expanded"             0.08
    " ભગ"                  0.08
    " мини"                0.08
    Activation Density: 0.001%

    No Known Activations