Explanations
References to online communities and subcultures, particularly those associated with specific platforms and their rules
Negative Logits
Token             Logit
?!                -0.32
podium            -0.31
assertEqual       -0.30
gjenge            -0.29
SizedBox          -0.29
닙                -0.29
Errorf            -0.28
awesome           -0.28
paire             -0.28
récord            -0.28
Positive Logits
Token             Logit
anon              0.69
fag               0.65
IntoConstraints   0.64
Anon              0.63
nigger            0.63
faggot            0.63
Kek               0.62
Kek               0.62
Anon              0.60
kek               0.59
Activations Density: 0.218%