Explanations
statements of importance or emphasis
phrases indicating obligation or necessity
Negative Logits
Token           Logit
anth            -0.71
ullivan         -0.68
avi             -0.63
NetMessage      -0.62
lip             -0.61
iliate          -0.59
olson           -0.59
ãĤ¼ãĤ¦ãĤ¹       -0.58
chrome          -0.56
opolis          -0.56
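One entry in the negative-logit list, `ãĤ¼ãĤ¦ãĤ¹`, looks like mojibake but is most likely a token printed in its raw byte-level BPE form. The sketch below assumes the vocabulary uses the GPT-2 byte-to-character table (an assumption; the page does not name the tokenizer) and maps such a string back to readable text. The helper names `bytes_to_unicode` and `decode_token` are illustrative, not part of the dashboard.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2 style byte -> printable-character table.

    Assumption: the dashboard's vocabulary uses this table.
    """
    # Bytes that are kept as their own Latin-1 character.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at 256 and above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: printable character -> original byte value.
_CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Turn a byte-level token string back into UTF-8 text."""
    raw = bytes(_CHAR_TO_BYTE[ch] for ch in token)
    # Tokens can end mid-character, so replace undecodable bytes
    # instead of raising.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_token("ãĤ¼ãĤ¦ãĤ¹"))  # prints ゼウス
```

Under that assumption the token is the katakana string ゼウス. Plain ASCII tokens such as `chrome` pass through unchanged, so the helper can be applied to every row of the list.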
Positive Logits
Token           Logit
applauded       1.20
ashamed         1.13
avoided         1.13
congratulated   0.99
recons          0.99
fitting         0.94
thanking        0.91
regarded        0.90
abolished       0.89
viewed          0.88
Activations Density 0.114%
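The density figure is not defined on the page. A common reading, assumed here, is the fraction of sampled tokens on which the feature's activation is above zero. The following minimal sketch shows that calculation; `activation_density` and its `threshold` parameter are hypothetical names chosen for illustration.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens where the feature fires.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


if __name__ == "__main__":
    # Toy data: the feature fires on one of four tokens.
    sample = [0.0, 0.0, 0.5, 0.0]
    print(f"{activation_density(sample):.3%}")
```

On this reading, a value as low as the one reported would mean the feature fires on roughly one token in a thousand, which fits a feature tied to a narrow set of phrasings.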