Explanations
phrases involving criticism or pressure from various sources
phrases indicating sources of criticism or pressure
Negative Logits
puter     -0.82
imate     -0.74
pleted    -0.72
iven      -0.70
gru       -0.70
nex       -0.70
film      -0.69
龍喚士    -0.68
merce     -0.68
nets      -0.68
Positive Logits
afar            1.41
abroad          1.10
passers         0.98
constituents    0.97
inside          0.89
within          0.87
listeners       0.85
creditors       0.84
superiors       0.83
commenters      0.83
Activations Density 0.105%