INDEX
    Explanations

    type followed by category

    Negative Logits
    " bottle"          1.05
    " bottles"         0.98
    (token missing)    0.94
    "רי"               0.90
    " چیر"             0.89
    " slats"           0.88
    "ें"                0.86
    " plastic"         0.86
    "bottle"           0.85
    "rv"               0.85
    Positive Logits
    " kelamin"         0.99
    " substituting"    0.92
    " mengisi"         0.92
    "ocurrencies"      0.87
    "Replacement"      0.86
    "cript"            0.83
    " memeriksa"       0.82
    " singkat"         0.82
    (token missing)    0.82
    " übernahm"        0.81
    Act Density 0.119%

    No Known Activations