INDEX
Explanations
phrases related to social and historical references, particularly regarding oppression and societal issues
Negative Logits
uovo        -0.45
colspan     -0.42
varandra    -0.41
ftagPool    -0.38
bakom       -0.37
honom       -0.35
listos      -0.34
Rugg        -0.34
posibil     -0.33
่ง          -0.33
Positive Logits
LookAnd         0.74
otomatig        0.71
:✨              0.65
Rüyada          0.63
surla           0.59
UnusedPrivate   0.55
Мексичка        0.55
UnsafeEnabled   0.54
msgTypes        0.54
endregion       0.53
Activations Density: 1.121%