INDEX
Explanations
verbs and phrases indicating expectations and comparisons
phrases related to expectations and perceived inadequacies
Negative Logits
Gul           -0.65
sweat         -0.57
reluct        -0.56
psychology    -0.54
parachute     -0.54
letters       -0.53
Hik           -0.53
ãĥĻ           -0.52
ronic         -0.51
Roose         -0.50
Positive Logits
anyways       0.93
anyway        0.90
!.            0.85
because       0.85
anymore       0.82
.--           0.78
;)            0.76
due           0.75
.             0.74
.#            0.74
Activations Density: 0.146%