INDEX
Explanations
references to military attacks and their implications
Negative Logits
ombre     -0.15
quil      -0.15
æijĨ       -0.14
rane      -0.14
:view     -0.14
éļİ       -0.14
dez       -0.13
Mehr      -0.13
cal       -0.13
stå       -0.13
Positive Logits
targets           0.41
Targets           0.35
targets           0.33
target            0.30
Targets           0.28
target            0.27
缮æłĩ              0.26
Target            0.24
-target           0.24
infrastructure    0.24
Activations Density 0.142%