INDEX
Explanations
phrases related to identity, faith, and religious themes
concepts related to gender identity and social issues surrounding it
Negative Logits
ipl         -0.75
opio        -0.68
Hond        -0.63
ERG         -0.62
cop         -0.61
Administ    -0.60
sched       -0.56
SUM         -0.56
ilater      -0.56
cknowled    -0.56
Positive Logits
aloud        0.94
isine        0.75
onstage      0.75
tattoo       0.73
while        0.73
prominently  0.72
blasp        0.71
punishable   0.70
instead      0.69
anymore      0.66
Activations Density 0.459%