INDEX
Explanations
numbers or numerical values
numerical values or statistics
Negative Logits
ItemImage   -0.66
conduc      -0.61
Ü           -0.61
creen       -0.56
ability     -0.56
ercise      -0.56
cerning     -0.56
Spears      -0.55
cold        -0.55
indo        -0.54
Positive Logits
xff         1.26
xc          1.01
xa          0.96
xd          0.96
xe          0.96
xb          0.91
x           0.90
603         0.87
644         0.82
uer         0.80
Activations Density 0.042%