INDEX
Explanations
references to the terrorist group "IS" or "ISIS"
references to the Islamic State (IS)
Negative Logits
issance      -0.87
Trace        -0.78
Reviewer     -0.72
Brach        -0.69
Pry          -0.69
Seym         -0.68
ruciating    -0.67
Hundred      -0.66
Hearth       -0.65
rities       -0.64
Positive Logits
SU           1.04
FP           1.02
Os           0.97
SI           0.94
BN           0.94
AF           0.91
OD           0.91
eq           0.90
PUT          0.86
AB           0.84
Activations Density: 0.014%