INDEX
Negative Logits
NONE          0.35
Regardless    0.34
Eclipse       0.34
Sometimes     0.33
Reminder      0.33
regardless    0.32
Regardless    0.32
JER           0.32
MAR           0.31
Simply        0.31
Positive Logits
necessarily        0.70
terribly           0.63
really             0.58
orious             0.57
glamorous          0.57
nécessairement     0.56
necessari          0.55
necessariamente    0.54
foolproof          0.53
necesariamente     0.53
Activations Density 0.045%