    Explanations

    variable names like D, B, A, E

    Negative Logits
    Token      Logit
    " aaa"     1.05
    " cnc"     1.04
    " xii"     1.02
    " xiii"    1.00
    " erc"     1.00
    " xiv"     0.98
    " tds"     0.98
    " xvii"    0.96
    "🔃"       0.96
    " xvi"     0.96
    Positive Logits
    Token      Logit
    "B"        2.06
    "C"        2.02
    " B"       2.02
    "E"        1.99
    " E"       1.99
    " C"       1.96
    "F"        1.94
    "D"        1.94
    "G"        1.92
    " D"       1.92
    Activation Density: 0.613%

    No Known Activations