Explanations
phrases or sentences related to actions being done
instances of the word "to," indicating a focus on infinitive verbs or actions
Negative Logits
deductions        -0.68
deviations        -0.61
detached          -0.60
ranges            -0.60
transformations   -0.60
knots             -0.60
binaries          -0.59
crossings         -0.59
activated         -0.58
surgeon           -0.58
Positive Logits
coincide           1.43
commemorate        1.42
celebrate          1.31
pless              1.22
promote            1.06
accompany          1.05
emphasize          1.03
remind             0.97
wered              0.96
illustrate         0.96
Activations Density: 0.240%
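
The page does not say how the density figure is computed. A minimal sketch, assuming it means the fraction of sampled tokens on which the feature's activation is above zero; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only to illustrate the arithmetic:

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all; the source page does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: 10,000 tokens, of which 24 activate the feature.
sample = [0.0] * 9976 + [1.5] * 24
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.240%"
```

Under this reading, a density of 0.240% would mean the feature fires on roughly 24 of every 10,000 tokens, which is typical of a sparse feature.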