Explanations
phrases indicating sources or origins of information
Negative Logits
from        -0.20
assisted    -0.17
next        -0.16
settlement  -0.16
once        -0.16
kla         -0.15
under       -0.15
guard       -0.14
ongs        -0.14
no          -0.14
Positive Logits
niž          0.16
-parse       0.15
oksen        0.15
CID          0.15
OSC          0.15
IDEO         0.15
/inet        0.15
ksen         0.14
าง           0.14
741          0.14
Activations Density 0.039%