INDEX
Explanations
phrases related to numerical values and relationships
terms related to classifications and categories, particularly those associated with structure and type
Negative Logits
destro          -0.75
prom            -0.72
constellation   -0.70
disg            -0.69
tremend         -0.69
amac            -0.69
ayn             -0.68
regul           -0.68
cig             -0.68
istine          -0.64
Positive Logits
Decay           1.13
Aging           0.94
Emails          0.85
Profile         0.83
Quote           0.82
wise            0.81
Space           0.80
Ey              0.80
Track           0.79
Shots           0.78
Activations Density 0.260%