INDEX
Explanations
phrases related to agreement or approval
statements related to opinions or personal sentiments
Negative Logits
¶ħ          -0.59
bryce       -0.52
ected       -0.50
blogspot    -0.50
brids       -0.48
Âł          -0.47
Previously  -0.47
selling     -0.46
olitan      -0.46
chool       -0.45
Positive Logits
depends     0.50
increments  0.48
oneself     0.47
behav       0.46
ain         0.44
Almighty    0.43
explan      0.43
TTC         0.43
finish      0.42
digest      0.42
Activations Density: 2.003%