INDEX
Explanations
phrases that indicate planning or actions to be taken
Negative Logits
inclusive    -0.17
ーチ         -0.16
енту         -0.15
ージ         -0.15
continu      -0.15
久久         -0.15
leftright    -0.15
stay         -0.14
ersist       -0.14
inclusive    -0.14
Positive Logits
ereg         0.18
introduced   0.15
Sing         0.15
seriously    0.15
reg          0.14
serious      0.14
Parr         0.14
stop         0.14
pity         0.14
interfering  0.14
Activations Density 0.047%