INDEX
Explanations
the phrase "at all" within sentences
phrases expressing negation or lack of something
Negative Logits
Token         Logit
LU            -0.70
jected        -0.67
ufact         -0.63
Chancellor    -0.62
utsche        -0.59
selling       -0.56
xit           -0.56
jun           -0.55
lance         -0.55
lict          -0.55
Positive Logits
Token         Logit
ocating       0.87
paws          0.72
expense       0.71
iances        0.71
except        0.70
seams         0.69
glance        0.67
hazards       0.65
proportions   0.65
kinds         0.63
Activations Density: 0.025%