INDEX
Explanations
references to political oppression and social issues related to governance and identity
Negative Logits
contentLoaded   -0.57
hip             -0.47
性              -0.41
ity             -0.41
ervant          -0.40
defaultstate    -0.39
typelib         -0.39
化              -0.39
localctx        -0.38
적인            -0.38
Positive Logits
ities           1.20
ments           1.16
ths             1.07
nesses          1.07
ings            1.05
hips            1.02
ures            1.02
ches            1.02
aries           1.01
ages            1.00
Activations Density: 1.617%