    Negative Logits
     Dest         -0.07
     että         -0.06
     wf           -0.06
     ->           -0.06
     kou          -0.06
    .async        -0.06
     parametro    -0.06
     amateur      -0.06
     bbc          -0.06
     meals        -0.06
    Positive Logits
    (NAME         0.07
     Truy         0.07
    possible      0.06
    :;↵           0.06
    .Tipo         0.06
                  0.06
    OfWork        0.06
    ]):           0.06
    .Evaluate     0.06
    (attr         0.06
    Act Density 0.005%

    No Known Activations