Explanations
references to broader or wider concepts, contexts, or communities
Negative Logits
girls    -0.70
_-       -0.68
Lex      -0.67
boy      -0.67
liest    -0.65
visor    -0.65
girl     -0.64
Guard    -0.64
Drawn    -0.64
rol      -0.63
Positive Logits
societal      0.94
spectrum      0.85
soType        0.83
than          0.79
reperto       0.79
context       0.78
ado           0.77
organis       0.74
circulation   0.73
ensemble      0.71
Activations Density 5.177%