Explanations
the words that are strongly emphasized or endorsed
expressions of strong belief or endorsement
Negative Logits
Token        Logit
oleon        -0.76
Settlement   -0.76
Victims      -0.74
Procedure    -0.73
Thumbnails   -0.69
illary       -0.68
arters       -0.68
Chaser       -0.68
Journals     -0.67
Millennium   -0.67
Positive Logits
Token        Logit
enough       0.88
recommend    0.87
discouraged  0.84
disagree     0.83
encouraged   0.81
advised      0.81
advise       0.81
correlated   0.79
discour      0.79
appreciated  0.78
Activations Density: 0.014%