Explanations
Instances of, and references to, specific individuals or groups, particularly in contexts related to their influence and actions in cultural or societal discussions
Negative Logits
λει           -0.16
uce           -0.15
roke          -0.15
uis           -0.15
rray          -0.15
istrovství    -0.14
erge          -0.14
ogui          -0.14
XF            -0.14
уру           -0.14
Positive Logits
sein           0.20
iture          0.17
983            0.14
heure          0.14
etc            0.14
ANGUAGE        0.14
inn            0.14
USR            0.14
Wander         0.14
DateTime       0.14
Activations Density 0.136%