INDEX
Explanations
prepositional phrases beginning with "of"
Negative Logits
soon         -0.74
lass         -0.71
aren         -0.70
artisan      -0.68
igent        -0.68
most         -0.67
ancer        -0.67
meric        -0.66
Accessory    -0.65
iple         -0.65
Positive Logits
relying      1.07
wasting      1.04
focusing     1.02
letting      1.01
blindly      0.97
bothering    0.95
having       0.94
being        0.93
complying    0.92
blaming      0.91
Activations Density 0.030%