INDEX
Explanations
phrases related to conveying positive information or updates
phrases indicating good news
Negative Logits
senal       -0.83
aval        -0.73
urat        -0.71
occup       -0.71
arij        -0.69
trained     -0.68
solvent     -0.67
ONSORED     -0.67
sediment    -0.67
apons       -0.63
Positive Logits
worthiness  0.88
worthy      0.87
night       0.75
Catholic    0.74
headline    0.74
reader      0.74
headlines   0.73
peak        0.73
mith        0.73
NEWS        0.71
Activations Density: 0.033%