INDEX
Explanations
quotations in text
quotation marks and phrases within them
Negative Logits
Fas           -0.79
cite          -0.78
Versus        -0.77
XL            -0.76
cf            -0.76
ABE           -0.75
tackle        -0.75
deductions    -0.74
BD            -0.73
chant         -0.72
Positive Logits
absolutely    1.88
extremely     1.87
completely    1.81
very          1.75
pretty        1.72
highly        1.72
quite         1.71
too           1.67
really        1.66
probably      1.65
Activations Density 0.083%
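
As a rough illustration of what a density figure like this could mean, the sketch below assumes density is the percentage of sampled tokens on which the feature's activation is nonzero. That definition, the function name `activation_density`, and the sample values are assumptions made for illustration; none of them come from this page.

```python
def activation_density(activations):
    """Return the percentage of tokens with a positive activation.

    Assumed definition (not stated on this page): density is the share of
    sampled tokens on which the feature fires at all.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    firing = sum(1 for a in activations if a > 0)
    return 100.0 * firing / len(activations)


# Hypothetical sample: 83 firing tokens out of 100,000.
sample = [1.0] * 83 + [0.0] * 99_917
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this assumed definition, a density of 0.083% would correspond to the feature firing on roughly one token in every 1,200.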