Explanations
references to gender attitudes and equality in society
Negative Logits
TestingModule   -0.94
Efq             -0.87
myſelf          -0.83
ſever           -0.83
Reſ             -0.82
Anſ             -0.81
purpoſe         -0.80
\{\\            -0.79
GEBURTSDATUM    -0.79
kasarigan       -0.78
Positive Logits
dotyczą         0.72
topics          0.72
titled          0.69
topic           0.67
tentang         0.67
about           0.64
regarding       0.63
关于            0.62
"               0.61
Topic           0.61
Activations Density: 0.754%