INDEX
Explanations
concepts related to prestige and achievement
Negative Logits
397        -0.18
opoulos    -0.17
andest     -0.15
orn        -0.15
ilar       -0.15
della      -0.14
aq         -0.14
ourse      -0.14
leck       -0.14
377        -0.14
Positive Logits
of         0.42
của        0.32
of         0.25
of         0.24
ของ        0.22
ofs        0.21
.of        0.20
ของ        0.18
์ของ       0.18
ủa         0.18
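
Token lists like the one above are often exported in byte-level BPE form, where each UTF-8 byte of a token is shown as a stand-in character, so non-Latin tokens look garbled. The following is a minimal sketch of mapping such a string back to readable text. It assumes the GPT-2 style byte-to-character table; the source does not name the tokenizer, so that choice is an assumption, and `byte_decoder` and `decode_token` are hypothetical helper names.

```python
def byte_decoder():
    """Build the stand-in-character -> byte table (assumed GPT-2 style mapping)."""
    # Bytes that are printable in Latin-1 are displayed as themselves.
    keep = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    table = {}
    shift = 0
    for b in range(256):
        if b in keep:
            table[chr(b)] = b
        else:
            # Remaining bytes are displayed as code points 256, 257, ...
            table[chr(256 + shift)] = b
            shift += 1
    return table


def decode_token(raw):
    """Turn a byte-level token string back into ordinary text."""
    table = byte_decoder()
    data = bytes(table[ch] for ch in raw)
    # A token may hold only part of a character, so tolerate invalid UTF-8.
    return data.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_token("c\u00e1\u00bb\u00a7a"))  # Vietnamese "của"
```

Under this assumption, the second-ranked positive token decodes to the Vietnamese word "của" and the fifth to the Thai word "ของ", both of which translate to "of".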
Activations Density 0.166%
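
As a rough illustration of what a density figure like this can mean, the sketch below computes the fraction of sampled tokens on which a feature fires. It assumes density is defined as the share of tokens with activation above a threshold of zero; the source does not state the definition, and `activation_density` is a hypothetical helper, not part of any named tool.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold` (assumed definition)."""
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


if __name__ == "__main__":
    # Toy data: the feature fires on 1 of 4 tokens.
    sample = [0.0, 0.0, 0.5, 0.0]
    print(f"{activation_density(sample):.3%}")
```

Under that assumed definition, a density of 0.166% corresponds to the feature firing on roughly 166 of every 100,000 tokens.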