INDEX
Explanations
references to social dynamics and interactions, particularly in contexts of power and consumerism
NEGATIVE LOGITS
réhen        -0.66
snippetHide  -0.65
]-->         -0.65
hiszen       -0.62
/>";         -0.62
]]           -0.62
}],          -0.62
mişti        -0.61
gewöhn       -0.57
vieles       -0.57
POSITIVE LOGITS
fucking       0.99
FUCKING       0.90
goddamn       0.88
͡°            0.88
fucking       0.87
fuckin        0.83
fuck          0.81
motherfucker  0.79
ಠ             0.78
Fucking       0.78
Activations Density 1.077%