    Negative Logits
    Token           Logit
     ſind           -0.71
    ?");            -0.59
    \"");           -0.59
    )");            -0.58
    tvguidetime     -0.57
    }");            -0.57
    RunWith         -0.57
    +};             -0.56
    )');            -0.56
    ]};             -0.56
    Positive Logits
    Token           Logit
     italic         2.30
    italic          2.02
     italics        1.61
    Italic          1.57
     Ital           1.36
    Ital            1.14
     ital           0.93
     ITAL           0.77
    ital            0.76
    ITAL            0.68
    Activation Density: 0.005%

    No Known Activations