Explanations
phrases related to taking action or support for a cause
Negative Logits
Downloadha      -0.67
vend            -0.63
geries          -0.63
similarities    -0.62
ibaba           -0.61
vulner          -0.60
gery            -0.60
fung            -0.60
unden           -0.59
therape         -0.57
Positive Logits
guiActive       0.70
Yourself        0.69
Observer        0.67
elle            0.60
bleacher        0.60
icer            0.59
leon            0.59
listener        0.59
uler            0.58
ゼウス          0.58
Activations Density 0.068%
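
The page does not define the density figure. A minimal sketch, assuming it is the fraction of sampled tokens on which the feature's activation is above zero; the function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical and only illustrate the arithmetic:

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 10,000 tokens, 7 of which activate the feature.
acts = [0.0] * 10_000
for i in range(7):
    acts[i * 1000] = 1.5

density = activation_density(acts)
print(f"{density:.3%}")  # 0.070%
```

Under that reading, a density of 0.068% would correspond to the feature firing on roughly 7 of every 10,000 tokens.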