INDEX
    Explanations

    conjunctions and adverbs that emphasize or modify actions

    Negative Logits
    Token                  Logit
    'featureID'            -0.91
    ' للاسماء'             -0.77
    ' betweenstory'        -0.76
    'IntoConstraints'      -0.72
    'parsedMessage'        -0.72
    ' ModelExpression'     -0.72
    ' <=",'                -0.67
    ' CreateTagHelper'     -0.66
    'IVEREF'               -0.66
    'InitVars'             -0.66

    Positive Logits
    Token                  Logit
    ' simply'               0.95
    ' just'                 0.91
    ' Просто'               0.90
    'Просто'                0.87
    'Simply'                0.86
    'Just'                  0.85
    'simply'                0.84
    ' einfach'              0.83
    ' Simply'               0.82
    'just'                  0.82

    Act Density 0.151%

    No Known Activations