INDEX
Explanations
mentions of healthcare-related issues and statistics
Negative Logits
inkle          -0.19
incel          -0.17
ÙĦÙĬÙĩ         -0.15
/Instruction   -0.14
Ekim           -0.14
vet            -0.14
EO             -0.14
mắt            -0.14
tort           -0.14
ein            -0.14
Positive Logits
HIV            0.54
AIDS           0.47
/AIDS          0.32
gay            0.27
Retro          0.24
viral          0.23
gay            0.21
retro          0.21
Gay            0.21
Gay            0.21
Activations Density: 0.053%