    Negative Logits
     ujednoznacz         -1.05
    SharedCtor           -0.95
     ModelExpression     -0.92
    verwijspagina        -0.86
     increí              -0.85
    <unused43>           -0.85
    <unused3>            -0.85
    RenderAtEndOf        -0.85
    <unused41>           -0.85
    [@BOS@]              -0.84
    Positive Logits
    ↵↵                    0.49
                          0.48
    '                     0.42
                          0.34
    <eos>                 0.33
    ado                   0.33
                          0.31
    <bos>                 0.31
    ps                    0.31
     Mal                  0.30
    Activation Density: 0.006%

    No Known Activations