Explanations
phrases indicating recognition or acknowledgment of individuals, typically in a context of capability or status
Negative Logits
SequentialGroup   -1.01
lèvres            -0.98
pouvoirs          -0.95
épaules           -0.86
négociations      -0.85
vœux              -0.85
démocr            -0.84
attentes          -0.81
intérêts          -0.79
côtés             -0.79
Positive Logits
themselves        1.08
themselves        0.91
những             0.72
are               0.67
ones              0.66
members           0.62
mga               0.60
veritable         0.59
examples          0.59
They              0.56
Activations Density 0.838%