NEGATIVE LOGITS
Token        Logit
hä           0.55
skine        0.50
шкин         0.49
イレ         0.49
quila        0.48
appointees   0.48
動           0.48
breed        0.47
optics       0.47
restrained   0.47
POSITIVE LOGITS
Token        Logit
LGBTQ        0.77
gay          0.68
suicide      0.68
LGBT         0.67
Suicide      0.66
Rainbow      0.65
homophobic   0.65
LGBT         0.62
rainbow      0.61
Trevor       0.61
ACTIVATION DENSITY
0.039%