Explanations
names or titles given to specific systems, projects, or groups
Negative Logits
SPONSORED     -0.82
baseman       -0.72
Constructed   -0.71
horr          -0.63
reassure      -0.62
horrified     -0.62
enjoyed       -0.62
dstg          -0.61
Edited        -0.61
tempted       -0.61
Positive Logits
"#            0.93
"             0.93
'             0.89
`             0.80
COP           0.80
Operation     0.79
``            0.78
Fancy         0.73
Excellence    0.71
RED           0.70
Activations Density 0.868%
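
The density figure above is commonly read as the share of sampled tokens on which the feature fires at all. The page does not say how it was computed, so the following is only a minimal sketch under that assumption; the function name `activation_density` and the zero threshold are hypothetical and not taken from the source.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of sampled tokens with a
    nonzero (above-threshold) activation, expressed as a percentage.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: the feature fires on 1 of 100 tokens.
sample = [0.0] * 99 + [1.5]
print(activation_density(sample))  # 1.0
```

Under this reading, a density of 0.868% would mean the feature is active on roughly 9 of every 1,000 sampled tokens.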