INDEX
Explanations
phrases indicating expectations or hopes
expressions of expectation and planning
Negative Logits
understands    -0.68
ravings        -0.67
compares       -0.65
varies         -0.63
consumes       -0.61
Witness        -0.60
ounces         -0.60
interacts      -0.59
matters        -0.58
orio           -0.58
Positive Logits
hoped            0.86
doom             0.75
earlier          0.70
ndra             0.68
planned          0.67
Manor            0.67
unsuccessfully   0.63
beforehand       0.63
bone             0.62
invincible       0.61
Activations Density: 0.435%