INDEX
Explanations
sentences that convey varying levels of confidence
Negative Logits
ardon             -0.16
ubu               -0.16
Configurer        -0.15
WaitForSeconds    -0.15
essler            -0.14
Mahon             -0.14
linger            -0.14
伦                -0.14
vester            -0.14
ependency         -0.14
Positive Logits
/conf             0.20
confidence        0.17
Confidence        0.17
ki                0.17
wart              0.16
Ki                0.15
Ki                0.15
assured           0.14
192               0.14
nt                0.14
Activations Density: 0.011%