INDEX
    Explanations

    significant quality or quantity

    Negative Logits
    ors          0.94
     Surgeons    0.91
    ä            0.82
     Inspectors  0.77
     Stiffness   0.75
     आपल्याला     0.74
     uprisings   0.73
    ましょう      0.72
     Бу          0.69
    POSITIVE LOGITS
    ו     1.23
    ва    1.16
    و     1.09
    ר     1.01
    ت     0.98
    де    0.97
    ar    0.96
    т     0.91
    the   0.90
    an    0.89
    Act Density 0.051%

    No Known Activations