Explanations
mentions of specific individuals in political contexts, likely associated with opinions or analysis
Negative Logits
ukong      -0.79
Pony       -0.71
aimon      -0.70
amina      -0.70
sacrific   -0.69
ickets     -0.66
Samar      -0.66
Vaugh      -0.66
alys       -0.65
unci       -0.65
Positive Logits
ï¸ı        1.16
ï¸         0.92
âĢ¢âĢ¢     0.81
hazard     0.80
Ther       0.78
ternity    0.78
ÃĽ         0.76
ecause     0.75
Lic        0.74
Eight      0.70
Activations Density: 10.917%