INDEX
Explanations
clear hierarchy
data centers
western cultures
no loading
Negative Logits
stared  0.59
stare  0.55
staring  0.52
smiled  0.49
EROS  0.49
is  0.48
for  0.48
isn  0.48
turner  0.48
then  0.48
Positive Logits
وتن  0.48
价  0.46
zię  0.45
్రీ  0.43
傳統  0.42
Democratic  0.42
กร  0.42
Vik  0.42
prijs  0.41
),  0.41
Activations Density 0.004%