Explanations
specific mentions of "Part" followed by a numerical value in the text
Negative Logits
culosis      -0.69
berus        -0.66
Predators    -0.59
stasy        -0.59
ceilings     -0.59
briefs       -0.58
cumbers      -0.57
iversal      -0.56
incinn       -0.55
gently       -0.55
Positive Logits
ridge        1.13
aking        0.94
ridges       0.94
ners         0.93
meal         0.92
icularly     0.91
icular       0.91
ials         0.88
nered        0.86
ially        0.85
Activations Density 0.390%