Explanations
short phrases indicating instructions or conditions
phrases that reference the reader's identity or experience
Negative Logits
"},            -0.84
NetMessage     -0.69
Amendments     -0.68
APP            -0.67
Eag            -0.65
ð              -0.64
rocal          -0.62
VERTISEMENT    -0.61
WOR            -0.60
Ham            -0.60
Positive Logits
inclined       0.84
lucky          0.83
unsure         0.82
unlucky        0.73
undecided      0.72
yourself       0.71
interested     0.71
subscribed     0.69
calibr         0.69
wondering      0.68
Activations Density: 0.138%