    NEGATIVE LOGITS
                      -0.81
    " vêtements"      -0.79
    "Jx"              -0.78
    " phối"           -0.77
    "Nix"             -0.76
    "hobo"            -0.73
    "exhaust"         -0.73
    "bonne"           -0.72
    " dương"          -0.70
    " råd"            -0.70
    POSITIVE LOGITS
    " sos"            1.03
    "sos"             1.02
    "zi"              1.01
    "Filter"          0.97
    " coefficients"   0.96
    " filter"         0.96
    "filter"          0.94
    "Coe"             0.94
    " poles"          0.92
    "FILTER"          0.90
    Act Density 0.024%

    No Known Activations