    Negative Logits
     мень  0.50
     menos  0.49
     менее  0.49
     moins  0.47
     LESS  0.47
     SAI  0.46
     меньше  0.45
    ursa  0.44
     Less  0.42
     lesser  0.42
    Positive Logits
     happen  0.44
     occur  0.42
     directions  0.41
     nothing  0.41
    ゥム  0.40
     ocorrer  0.38
     கொண்டார்  0.38
     llegue  0.37
     nimic  0.37
    0.37
    Act Density 0.000%

    No Known Activations