INDEX
Explanations
concepts related to truth, legitimacy, and moral lessons
Negative Logits
FY            -0.67
Verb          -0.62
oxide         -0.59
chy           -0.59
Chains        -0.59
stormed       -0.58
totaled       -0.57
Drunk         -0.57
catentry      -0.57
rive          -0.57
Positive Logits
inherent      0.79
lurking       0.78
parallels     0.77
similarities  0.76
precedent     0.75
paralle       0.73
overlap       0.72
underlying    0.71
downside      0.70
hurst         0.69
Activations Density 0.093%