INDEX
Explanations
references to societal issues and personal experiences related to prejudice and injustice
Negative Logits
velle             -0.15
ë¬                -0.15
Disposition       -0.15
.Logf             -0.15
KNOWN             -0.15
oto               -0.14
iamo              -0.14
ulumi             -0.14
.scalablytyped    -0.14
.Startup          -0.14
Positive Logits
ollah             0.19
involved          0.17
involve           0.15
McG               0.15
involvement       0.14
selections        0.14
formerly          0.14
ocratic           0.14
essay             0.14
aub               0.13
Activations Density: 0.341%