INDEX
Explanations
references to specific items or projects related to Israel
Negative Logits
antz      -0.15
уч        -0.15
iek       -0.14
eded      -0.14
errick    -0.14
rok       -0.14
rips      -0.14
ropa      -0.13
ropic     -0.13
ух        -0.13
Positive Logits
deaux     0.18
орони     0.14
Ashe      0.14
اق        0.14
Altern    0.14
Altern    0.14
altern    0.14
cad       0.14
Morg      0.14
Alta      0.13
Activations Density 0.050%