INDEX
NEGATIVE LOGITS
eta             0.95
Delta           0.87
theta           0.86
delta           0.86
vec             0.84
equiv           0.84
beta            0.82
alpha           0.82
Math            0.81
save            0.80
POSITIVE LOGITS
misconceptions  0.77
worms           0.77
clarification   0.76
judiciary       0.76
"));            0.76
weeds           0.75
fuc             0.75
FAQs            0.75
wasteland       0.74
Literacy        0.74
Activations Density 0.011%