Explanations
statements where there is an indication or feeling conveyed
phrases that express a sense of conjecture or speculation
Negative Logits
rouse      -0.91
otos       -0.84
venge      -0.79
orem       -0.73
atching    -0.73
ests       -0.73
izons      -0.72
andise     -0.71
iling      -0.71
umbn       -0.69
Positive Logits
unfair      0.81
doubtful    0.74
ãĤ¨         0.74
unlikely    0.72
unclear     0.71
probable    0.71
ril         0.68
imperative  0.68
Chimera     0.67
abundantly  0.66
Activations Density 0.049%