Explanations
terms related to political and social issues, particularly those affecting specific groups or regions
Negative Logits
Token                Logit
Dud                  -0.15
ç²ī                  -0.14
.documents           -0.13
usu                  -0.13
151                  -0.13
bedo                 -0.13
æī¶                  -0.12
cancellationToken    -0.12
IBLE                 -0.12
989                  -0.12
Positive Logits
Token                Logit
ur                   0.91
UR                   0.80
ur                   0.77
Ur                   0.73
Ur                   0.69
UR                   0.68
ÑĥÑĢ                 0.63
_ur                  0.63
.ur                  0.61
urs                  0.60
Activations Density 0.267%