INDEX
Explanations
phrases indicating purpose or function
Negative Logits
isi        -0.81
ategory    -0.79
chan       -0.75
akings     -0.75
anamo      -0.75
oder       -0.75
RIPT       -0.71
etta       -0.71
azo        -0.71
HCR        -0.71
Positive Logits
coordinated    0.78
unanimous      0.75
accurate       0.70
contraction    0.68
careful        0.68
graceful       0.67
upward         0.67
rapid          0.67
orderly        0.67
downward       0.66
Activations Density: 0.107%