INDEX
    Explanations

    instances of the word "to" and its various forms, suggesting a focus on infinitive verbs or directions

    Negative Logits
    Ñıг         -0.17
    843         -0.16
    \App        -0.15
    ãĥ¼ãĥ³      -0.15
    elters      -0.15
    839         -0.15
    218         -0.15
    bew         -0.15
    STANCE      -0.14
    radient     -0.14
    Positive Logits
     sometimes    0.16
    YN            0.15
     maybe        0.15
    idata         0.15
    fund          0.15
    ads           0.14
     average      0.14
    ogan          0.14
    WF            0.14
    Correction    0.14
    Activation Density: 0.040%

    No Known Activations