Explanations
generating sexually explicit content
Negative Logits
anything  0.94
crimes  0.81
anything  0.80
products  0.79
brands  0.77
goods  0.77
things  0.77
scandals  0.76
items  0.76
cancers  0.75
Positive Logits
এবং  1.33
ಮತ್ತು  1.29
、  1.22
आणि  1.22
और  1.19
и  1.19
మరియు  1.19
һәм  1.19
અને  1.18
và  1.15
Activations Density 0.347%