INDEX
    Negative Logits
        " sério"        -0.09
        " severe"       -0.08
        " Severe"       -0.08
        " anytime"      -0.08
        " pension"      -0.08
        " sérieux"      -0.08
        " politely"     -0.08
        " seriousness"  -0.07
        " tranquill"    -0.07
        " severely"     -0.07
    Positive Logits
        "(ct"           0.09
        " {↵↵"          0.08
        " sqrt"         0.08
        [token missing] 0.08
        [token missing] 0.08
        " 있는데"       0.08
        " 상승"         0.07
        "ിക്കൽ"         0.07
        [token missing] 0.07
        " Filho"        0.07
    Activation Density: 0.012%

    No Known Activations