INDEX
Explanations
mentions of specific names or terms related to LGBTQ topics and vague legal language
followed by "g" in the middle
activity or ending in gy
Negative Logits
nahilalakip        -0.68
ویکیپدی        -0.65
незавершена        -0.62
::::::::           -0.61
:✨                 -0.61
PerformLayout      -0.60
Terraria           -0.59
[]:                -0.59
StoryboardSegue    -0.59
awaiter            -0.58
Positive Logits
ging     0.78
ged      0.71
gs       0.70
gy       0.70
ggg      0.68
gers     0.62
gggg     0.61
gings    0.60
ges      0.60
GER      0.60
Activations Density 0.937%