    Negative Logits
    vf          0.58
    fen         0.55
    conse       0.50
    ؛           0.50
    fabs        0.50
    estock      0.49
    abee        0.49
    af          0.49
    z           0.49
    vaj         0.48
    Positive Logits
     terrorism  0.59
     редак      0.52
     nessuna    0.49
     Terrorism  0.49
     pequena    0.48
     maatau     0.48
     NAND       0.48
                0.48
     Veterin    0.47
     piccola    0.47
    Activation Density 0.000%

    No Known Activations