INDEX
Explanations
phrases related to determining or evaluating something
repeated references to "the" as a common structural element across different contexts
Negative Logits
âĿ             -0.69
note           -0.68
frog           -0.67
tumblr         -0.66
Save           -0.66
hari           -0.66
NB             -0.65
OTS            -0.65
SPONSORED      -0.63
quished        -0.63
Positive Logits
extent         1.59
effectiveness  1.37
amount         1.35
likelihood     1.34
severity       1.32
outcome        1.28
adequ          1.25
direction      1.25
availability   1.23
efficacy       1.21
Activations Density 0.390%