INDEX
Explanations
phrases expressing frustration or criticism towards media and content quality
Negative Logits
iggins  -0.16
bes     -0.15
env     -0.15
idle    -0.14
ift     -0.14
iž      -0.14
ult     -0.14
gens    -0.14
ifting  -0.14
ewis    -0.13
POSITIVE LOGITS
why      0.16
obi      0.15
ARSE     0.15
oir      0.14
ãĥ¼ãĥī   0.14
ÏĦζ      0.14
лÑĥÑĩ   0.14
usz      0.14
.future  0.13
دÙĨ     0.13
Activations Density 0.113%