    Explanations

    explaining "equivalent"

    NEGATIVE LOGITS
    ил            0.96
    ες            0.90
                  0.89
    </h3>         0.89
     مُ           0.88
    ING           0.87
                  0.87
    お金          0.84
    ة             0.83
     crucified    0.83
    POSITIVE LOGITS
     equivalent   1.30
     Equivalent   1.08
    0             1.05
     equivalente  1.03
     equivalents  1.02
    x             0.98
    ah            0.92
    on            0.89
    )             0.85
    Equivalent    0.83
    Activation Density 0.021%

    No Known Activations