INDEX
Explanations
phrases related to social progress and the functioning of society
concepts related to societal challenges and the importance of collective action for change
Negative Logits
assorted        -0.74
catentry        -0.73
Blades          -0.61
nods            -0.61
alis            -0.60
respectively    -0.59
Bits            -0.59
considerable    -0.59
various         -0.58
periodically    -0.57
Positive Logits
anymore         1.50
unless          1.39
without         1.24
unless          1.17
nor             1.15
WITHOUT         1.12
without         1.10
indefinitely    0.94
Without         0.92
except          0.89
Activations Density: 0.570%