INDEX
Explanations
verbs related to actions, decisions, and intentions
verbs that express intention or expectation
Negative Logits
Token           Logit
Fail            -0.71
folk            -0.70
aminer          -0.68
eers            -0.64
Eva             -0.63
!),             -0.62
foothold        -0.62
)!              -0.62
!).             -0.61
conveniently    -0.61
Positive Logits
Token           Logit
"#              0.91
personally      0.86
"'              0.83
"               0.75
"[              0.75
poke            0.75
Paddock         0.73
resign          0.72
himself         0.71
constituents    0.69
Activations Density: 0.618%