INDEX
    Explanations

    inversely proportional to square

    Negative Logits
    َ
    0.92
    एसआय
    0.84
    𝑭
    0.84
    🌜
    0.82
    ah
    0.81
    0.80
    𝑯
    0.79
     ɛ
    0.78
    надцать
    0.78
     variabili
    0.77
    Positive Logits
    عت
    0.75
     don
    0.70
     sighted
    0.69
     khắp
    0.65
    טח
    0.65
    くまで
    0.64
     åt
    0.63
    чения
    0.61
     جه
    0.60
    さが
    0.60
    Act Density 0.005%

    No Known Activations