INDEX
Explanations
content ratings and classifications in media
Negative Logits
dorf     -0.16
ARK      -0.16
arkan    -0.16
èĨľ      -0.15
isse     -0.15
pit      -0.14
ari      -0.13
èij      -0.13
cci      -0.13
antt     -0.13
Positive Logits
rated     0.45
rating    0.44
PG        0.44
Rated     0.41
Rating    0.40
Rating    0.40
-rated    0.40
ratings   0.38
PG        0.38
rating    0.37
Activations Density: 0.053%