INDEX
    Explanations

    the presence of the word "for" in various contexts

    Negative Logits
     Sith
    -0.17
     ar
    -0.16
    Template
    -0.16
     Template
    -0.15
    uario
    -0.15
     best
    -0.14
     Size
    -0.14
    ,
    -0.14
     ramp
    -0.14
    ajas
    -0.14
    Positive Logits
    hodob
    0.17
    γά
    0.17
    čem
    0.16
    ynes
    0.15
    svp
    0.15
     الأس
    0.14
     Essen
    0.14
     порушення
    0.14
    ược
    0.14
     обы
    0.14
    Act Density 0.158%

    No Known Activations