    Explanations

    technical compound phrases

    Negative Logits
    Token             Logit
    "datasets"        0.97
    "Magnitude"       0.94
    "Де"              0.92
    "У"               0.91
    "Ха"              0.90
    "Accordion"       0.89
                      0.89
    " GetSRP"         0.88
    "Pais"            0.88
    "<unused1453>"    0.88
    Positive Logits
    Token             Logit
    " ("              1.11
    " fucking"        1.03
    " freaking"       0.94
    " The"            0.94
    " ["              0.90
                      0.81
    " even"           0.80
    " details"        0.78
    " something"      0.77
    " "               0.76
    Activation Density: 2.205%

    No Known Activations