INDEX
    NEGATIVE LOGITS
    "ñón"  0.47
    " ቀለም"  0.42
    "ждествен"  0.41
    "زال"  0.40
    "ósfera"  0.40
    "真正"  0.39
    "waarden"  0.39
    "rétaire"  0.39
    "ปลอด"  0.38
    "}:${"  0.37
    POSITIVE LOGITS
    " diminishing"  1.70
    " diminish"  1.07
    " diminishes"  1.05
    " Dim"  0.99
    " diminished"  0.96
    " marginal"  0.88
    "Dim"  0.87
    " dimin"  0.85
    "dim"  0.82
    " diminution"  0.80
    Activation Density: 0.005%

    No Known Activations