    Negative Logits
    Token              Logit
    [token missing]    -1.27
    [token missing]    -1.19
    [token missing]    -1.13
    " ikke"            -1.07
    " которые"         -1.04
    "мера"             -1.01
    " を"              -1.01
    " которая"         -1.00
    "łem"              -0.97
    " sitä"            -0.97
    Positive Logits
    Token              Logit
    " has"             1.06
    " have"            0.98
    " will"            0.97
    "ここの"           0.96
    " accompanies"     0.96
    "ができます"       0.96
    " vielmehr"        0.95
    " would"           0.94
    " habrá"           0.93
    " esterna"         0.93
    Act Density 0.004%

    No Known Activations