INDEX
    Explanations

    specific numbers, quantified measurements, and references to external or contextual elements

    Negative Logits
     addCriterion    -0.20
    ariat            -0.15
    aming            -0.14
    oÅĻ              -0.14
    otime            -0.14
     ÐľÐ¾Ð¶          -0.14
    км               -0.14
    دد               -0.14
    shaw             -0.14
    orget            -0.14

    Positive Logits
    olan             0.22
     å¯Į             0.15
    avou             0.15
     beef            0.15
    uldu             0.14
    513              0.14
     inject          0.14
    ose              0.14
    .try             0.13
     Stefan          0.13
    Activation Density 0.035%

    No Known Activations