INDEX
    NEGATIVE LOGITS
    ږئ    0.52
    শংস    0.50
     Rajya    0.49
     bun    0.48
    いました    0.47
    )^\    0.46
    ContextHeader    0.46
     הה    0.45
     أيضاً    0.45
    لي    0.44
    POSITIVE LOGITS
     surface    1.06
    Surface    0.98
     Surface    0.95
     surfaces    0.92
    surface    0.87
     поверхности    0.82
     SURFACE    0.81
     povr    0.80
    flächen    0.79
     सरफेस    0.79
    Activation Density: 0.153%

    No Known Activations