Explanations
comparisons between different concepts or entities
phrases related to societal structures and distinctions
Negative Logits
Balt          -0.64
abouts        -0.63
cedented      -0.62
Moscow        -0.60
accompanied   -0.59
abled         -0.59
âĹ¼           -0.59
Kazakh        -0.59
UPDATE        -0.59
"}            -0.58
Positive Logits
strive        0.92
nurture       0.90
obedience     0.78
striving      0.77
selfish       0.77
cultivate     0.76
mentors       0.73
strives       0.73
nurt          0.73
coward        0.73
Activations Density: 0.955%