INDEX
Explanations
mentions of current events or recent trends
phrases indicating a fresh start or new opportunities
Negative Logits
evolves      -0.61
antis        -0.60
progresses   -0.58
atories      -0.57
ulative      -0.57
erial        -0.57
lasting      -0.54
grows        -0.53
asymm        -0.53
UTC          -0.53
Positive Logits
congratulations   0.74
probably          0.73
NOW               0.73
starter           0.70
rejoice           0.70
joy               0.70
your              0.69
RELE              0.68
prime             0.67
surely            0.67
Activations Density: 0.223%