INDEX
    Explanations

    reasoning and justification

    Words and short phrases that signal reasoning, causation, or discourse-connective structure (explanatory or contrastive).

    Negative Logits
    Token          Logit
    " scandal"     -0.06
    "\tsignal"     -0.06
    " driven"      -0.06
    " Charger"     -0.06
    " الميلاد"     -0.06
    " Sergio"      -0.06
    " جزء"         -0.06
    "ским"         -0.06
    ".Resolve"     -0.06
    " princes"     -0.06
    Positive Logits
    Token          Logit
    "군요"          0.07
    (blank)        0.07
    " ROS"         0.06
    " речі"        0.06
    " เท"          0.06
    "đ"            0.06
    "cljs"         0.06
    (blank)        0.06
    " je"          0.06
    "ξεις"         0.06
    Activation Density: 0.125%

    No Known Activations