INDEX
Explanations
information related to rankings or positions in a list
rankings and statistics related to performance in various categories
Negative Logits
tun        -0.62
rod        -0.61
interns    -0.59
lot        -0.58
WER        -0.57
matter     -0.57
Done       -0.57
CHAT       -0.56
rett       -0.56
Report     -0.56
Positive Logits
ibaba      0.86
offensive  0.78
roads      0.71
overall    0.70
ordinate   0.68
clusively  0.68
order      0.68
Honolulu   0.67
Hawaii     0.67
clusions   0.65
Activations Density 0.092%