INDEX
    Explanations
    Negative Logits
    0
    -0.07
     Telecom       -0.07
    _dims          -0.07
    ideal          -0.07
     który         -0.07
    _watch         -0.07
     всі           -0.06
     kon           -0.06
     surrounded    -0.06
     outras        -0.06
    Positive Logits
     January       0.08
    January        0.08
     February      0.07
     Initializing  0.07
    ají            0.07
    ATUS           0.07
     min           0.07
    ποιη           0.07
     minib         0.07
    0.06
    Activation Density: 0.028%

    No Known Activations