INDEX
    Explanations

    instances of the word "for" and its variations, indicating a focus on reasons or justifications in various contexts

    Negative Logits
    Token            Logit
    " Rider"         -0.16
    "пов"            -0.15
    "onth"           -0.14
    "ascus"          -0.13
    "erializer"      -0.13
    " rider"         -0.13
    "emoc"           -0.13
    "anda"           -0.13
    "atron"          -0.13
    "izo"            -0.13

    Positive Logits
    Token            Logit
    "-Allow"         0.17
    "ÑĤÑİ"           0.16
    "oved"           0.15
    "è£ķ"            0.15
    "agers"          0.15
    " lesson"        0.14
    "<decltype"      0.14
    "AllWindows"     0.14
    "asaki"          0.13
    " DeepCopy"      0.13
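
    Two of the positive-logit tokens ("ÑĤÑİ" and "è£ķ") look like raw byte-level BPE display strings rather than readable text. The sketch below shows one way to turn such strings back into text. It assumes, without confirmation from this page, that the underlying model uses a GPT-2-style byte-to-unicode table; the helper names gpt2_byte_decoder and decode_token are hypothetical and not part of any library.

```python
def gpt2_byte_decoder():
    """Build the inverse of a GPT-2-style byte-to-unicode table.

    Assumption: the tokenizer displays each raw byte as one printable
    character, as GPT-2's byte-level BPE does.
    """
    # Bytes that are shown as themselves (printable Latin-1 ranges).
    printable = (
        list(range(0x21, 0x7E + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    mapping = {chr(b): b for b in printable}

    # Every remaining byte is shown as chr(256 + n), in byte order.
    n = 0
    for b in range(256):
        if b not in printable:
            mapping[chr(256 + n)] = b
            n += 1
    return mapping


def decode_token(display: str) -> str:
    """Map a display string back to bytes, then decode those bytes as UTF-8."""
    table = gpt2_byte_decoder()
    raw = bytes(table[ch] for ch in display)
    # A token can end mid-character, so replace undecodable bytes.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for token in ["ÑĤÑİ", "è£ķ"]:
        print(repr(token), "->", repr(decode_token(token)))
```

    Under that assumption, "ÑĤÑİ" decodes to the Cyrillic fragment "тю" and "è£ķ" decodes to the single CJK character U+88D5. If the model uses a different tokenizer, the table would need to be replaced accordingly.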
    Act Density 0.077%

    No Known Activations