    Explanations

    open weights and transparency

    Negative Logits
    " utilizar"     0.39
    " మూడు"         0.38
    " മറ്റൊരു"       0.38
    " delicate"     0.37
    " nutzen"       0.36
    " Calyce"       0.35
    " Aloe"         0.35
    " بط"           0.35
    " elegante"     0.35
    ">["            0.35
    Positive Logits
    " transparency"     1.09
    "Transparency"      1.04
    " openly"           1.00
    " Transparency"     0.99
    "公开"              0.98
    " transparence"     0.98
    " transparencia"    0.95
    " transparent"      0.93
    " पारदर्शिता"         0.92
    " 공개"             0.91
    Activation Density: 0.355%

    No Known Activations