Explanations
the word "taken" and its variations, indicating references to actions or situations involving something being taken or removed
Negative Logits
gave      -1.14
saw       -1.01
took      -1.00
did       -0.94
was       -0.87
threw     -0.77
gave      -0.76
went      -0.75
came      -0.74
showed    -0.71
Positive Logits
taken     1.63
taken     1.60
flown     1.59
gone      1.58
Taken     1.54
arisen    1.51
Taken     1.48
gone      1.47
spoken    1.46
Seen      1.45
Activations Density: 0.211%