INDEX
Explanations
comparisons between different quantities, emphasizing increases or decreases
Negative Logits
æ©               -0.79
borg             -0.71
boxing           -0.71
xtap             -0.67
Outbreak         -0.67
Classification   -0.66
Ń·               -0.64
\\\\\\\\         -0.63
atan             -0.63
����             -0.63
Positive Logits
than             0.79
fortunate        0.79
desirable        0.79
mature           0.79
prevalent        0.78
importantly      0.78
realistic        0.78
inclined         0.75
frequent         0.74
adventurous      0.74
Activations Density 0.020%