INDEX
    Explanations

    instances of high numerical values

    Category classification

    NEGATIVE LOGITS
    ioutil            -0.57
     saveiro          -0.57
    MethodManager     -0.57
     calendriers      -0.56
     indígen          -0.54
     ویکی‌پدی         -0.54
     vlasy            -0.52
    genodigd          -0.51
     gehouden         -0.51
     hilsen           -0.51
    POSITIVE LOGITS
    ↵↵↵               0.78
    ↵↵↵↵              0.68
    ↵↵↵↵↵             0.64
     Stahl            0.56
    ↵↵↵↵↵↵↵           0.55
    ↵↵↵↵↵↵↵↵          0.54
    ↵↵↵↵↵↵↵↵↵         0.53
    ↵↵↵↵↵↵            0.52
     McCle            0.50
    mstyle            0.48
    Act Density 0.018%

    No Known Activations