INDEX
Explanations
instances of the word "When" indicating temporal contexts or conditions
Negative Logits
Token          Logit
them           -0.48
honom          -0.46
ujednoznacz    -0.40
him            -0.39
himſelf        -0.38
ihn            -0.37
Them           -0.36
../            -0.36
adalah         -0.35
them           -0.34
Positive Logits
Token          Logit
confronted     1.12
faced          1.09
compared       0.98
asked          0.97
they           0.94
we             0.88
dealing        0.87
considering    0.87
viewed         0.84
discussing     0.84
Activations Density: 0.196%