INDEX
Explanations
quotations marked with double quotation marks
Negative Logits
Token      Logit
accomp     -0.98
carrier    -0.81
agric      -0.77
viral      -0.77
adjud      -0.75
dubbed     -0.75
liner      -0.75
cleanup    -0.74
brim       -0.74
flared     -0.74
Positive Logits
Token        Logit
We           1.59
Nobody       1.59
There        1.58
Everybody    1.58
Whoever      1.57
I            1.57
Absolutely   1.57
If           1.55
It           1.53
Everyone     1.51
Activations Density: 0.112%