INDEX
Explanations
words related to political figures and their actions
occurrences of the character "Ļ"
Negative Logits
Token           Logit
disadvant       -0.77
undown          -0.73
Participation   -0.70
adolesc         -0.70
tabloid         -0.70
Palestin        -0.70
Lawn            -0.65
Folk            -0.65
distractions    -0.65
decimal         -0.65
Positive Logits
Token           Logit
ï¸ı             0.97
mir             0.90
felt            0.88
sure            0.85
lean            0.84
strong          0.83
forth           0.82
else            0.81
Balt            0.81
mand            0.80
Activations Density 0.378%
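
A minimal sketch of how a density figure like the one above could be computed, assuming it denotes the percentage of sampled token positions on which the feature has a nonzero activation. That reading, the function name `activation_density`, and the sample data are assumptions for illustration and are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all, expressed as a percentage.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: 1,000 token positions, 4 of which activate the feature.
sample = [0.0] * 996 + [0.8, 1.2, 0.3, 2.1]
density = activation_density(sample)
print(f"Activations Density {density:.3f}%")
```

Under this reading, a density of 0.378% would mean the feature fires on roughly 4 of every 1,000 tokens.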