INDEX
Explanations
descriptive quantities and characteristics related to objects and environments
Negative Logits
Token            Logit
escription       -0.70
akings           -0.66
PDATE            -0.66
icter            -0.66
Reviewer         -0.65
acea             -0.65
notwithstanding  -0.64
undown           -0.63
OUP              -0.63
replacements     -0.62
Positive Logits
Token            Logit
pedest            1.00
bushes            0.90
stretched         0.89
unsuspecting      0.86
oak               0.86
rooft             0.85
windshield        0.85
holes             0.84
mound             0.83
skysc             0.83
Activations Density: 0.332%