Negative Logits
July        0.42
general     0.41
↵↵          0.40
online      0.40
)           0.39
inter       0.38
]           0.38
domains     0.38
academic    0.37
cohort      0.37
Positive Logits
Didn        0.57
Doesn       0.52
Wouldn      0.52
Does        0.52
Honestly    0.51
Seriously   0.50
doesn       0.49
Wouldn      0.49
Didn        0.48
when        0.48
Activations Density: 0.015%