Explanations
phrases related to observation and commentary
phrases and expressions related to personal perception and experience
Negative Logits
Token          Logit
surprisingly   -0.74
®              -0.71
etheless       -0.69
osponsors      -0.62
arist          -0.61
utterstock     -0.58
ensis          -0.58
"...           -0.57
purportedly    -0.57
collect        -0.56
Positive Logits
Token   Logit
)."     1.10
.")     1.08
.'"     1.01
.""     0.93
."      0.92
!'"     0.91
…"      0.91
!"      0.87
'."     0.86
)"      0.82
Activations Density 1.119%