INDEX
Explanations
keywords related to political events and conflicts involving prominent figures or groups
Negative Logits
olt    -0.15
oufl   -0.14
Bias   -0.14
trục   -0.14
azzi   -0.14
tera   -0.14
تان    -0.13
рел    -0.13
rowse  -0.13
ula    -0.13
Positive Logits
duk      0.35
squared  0.32
face     0.32
duke     0.31
spar     0.31
square   0.30
compete  0.30
faces    0.28
faced    0.27
engage   0.26
Activations Density 0.150%