INDEX
Explanations
quoted phrases or citations in the text
Negative Logits
RFC           -0.87
pg            -0.84
compartment   -0.80
cm            -0.73
MT            -0.71
checkpoint    -0.71
Gustav        -0.70
moot          -0.69
mm            -0.68
rack          -0.68
Positive Logits
duty          1.14
matter        1.12
Definition    1.09
sweet         1.09
Agent         1.06
dark          1.04
shine         1.02
rules         1.02
good          1.01
Ru            1.01
Activations Density: 0.045%