INDEX
Explanations
instances of someone openly expressing something
expressions of openness regarding personal identity or beliefs
Negative Logits
ĸļ            -0.76
batch         -0.75
Normandy      -0.68
nesota        -0.68
Expedition    -0.67
Millennium    -0.66
Rover         -0.65
Tune          -0.65
ifully        -0.65
ENCY          -0.65
Positive Logits
defied         1.01
hostile        0.96
contradicted   0.96
contradict     0.95
defy           0.89
contradicts    0.88
bisexual       0.86
antagon        0.83
admit          0.82
admitting      0.81
Activations Density 0.041%