INDEX
Explanations
terms related to physical characteristics or attributes of objects
Negative Logits
Nadu       -0.75
ACTED      -0.70
icago      -0.69
ansen      -0.68
Cron       -0.66
scl        -0.65
overs      -0.65
Twins      -0.64
sie        -0.61
mammoth    -0.60
Positive Logits
properties  0.93
properties  0.80
uation      0.79
ality       0.77
Changed     0.76
iveness     0.76
ious        0.76
istry       0.73
ually       0.72
ter         0.71
Activations Density: 0.010%