INDEX
    Explanations

    mentioning specific details

    Negative Logits
    Token             Value
    " Question"       0.44
    " Choose"         0.43
    " Understanding"  0.42
    " Abandon"        0.41
    " Aboriginal"     0.39
    " Answer"         0.38
    " Choosing"       0.38
    " stellte"        0.38
    " answer"         0.38
    " Determine"      0.37
    Positive Logits
    Token             Value
    " 언급"            0.76
    " mention"        0.73
    " mentions"       0.69
    " mentioning"     0.69
    "Mention"         0.67
    "mention"         0.65
    " जिक्र"           0.63
    " mencion"        0.62
    " menciona"       0.61
    " упомина"        0.60
    Activation Density: 0.148%

    No Known Activations