Explanations
phrases expressing strong beliefs or opinions
expressions of strong belief or endorsement
Negative Logits
Settlement    -0.74
eon           -0.74
OTOS          -0.72
raltar        -0.72
Chaser        -0.72
oleon         -0.72
Procedure     -0.72
Tycoon        -0.71
Journals      -0.71
Corpse        -0.71
Positive Logits
enough            0.92
disagree          0.78
differentiated    0.76
correlated        0.76
ener              0.75
encouraged        0.75
incentiv          0.75
advise            0.75
discouraged       0.74
charged           0.74
Activations Density: 0.011%