INDEX
Explanations
statistics or numerical data related to performance averages
phrases related to statistical averages or performance metrics
Negative Logits
abet        -0.72
Nou         -0.70
ACTED       -0.70
Starship    -0.67
ighter      -0.66
Rey         -0.66
bis         -0.64
Drug        -0.64
raft        -0.64
gravity     -0.63
Positive Logits
averages    1.11
averaging   0.98
averaged    0.95
icals       0.87
imates      0.87
avg         0.84
mble        0.80
ILCS        0.79
unbeliev    0.77
ically      0.77
Activations Density 0.008%