INDEX
Explanations
phrases related to incidents, conflicts, or massive actions
phrases related to societal issues and cultural critique
Negative Logits
SPONSORED   -0.67
vertising   -0.63
yna         -0.61
bledon      -0.61
rive        -0.60
mathemat    -0.59
dinand      -0.56
âĢij         -0.55
neighb      -0.55
Īè          -0.55
Positive Logits
]           1.42
):          1.42
)           1.35
].          1.25
Âł          1.25
))          1.23
),          1.23
↵Âł         1.21
);          1.21
,"          1.19
Activations Density: 0.600%