Explanations
phrases indicating a specific event or action occurring at a particular time or location
instances of the word "when" indicating temporal context in situations
Negative Logits
hack
-0.68
ha
-0.65
Grade
-0.65
hi
-0.64
less
-0.62
ictive
-0.62
nig
-0.62
Fine
-0.61
cut
-0.60
agin
-0.58
Positive Logits
soever
0.90
*/(
0.90
confronted
0.73
encountering
0.72
they
0.72
irlf
0.72
":[{"
0.70
asked
0.70
undergoing
0.68
someone
0.66
Activations Density 0.083%