Explanations
themes related to conflict or war
Negative Logits
Token               Logit
CEL                 -0.17
rai                 -0.15
raq                 -0.15
dangerous           -0.14
Bi                  -0.14
Gas                 -0.14
OWL                 -0.14
.handlers           -0.14
adera               -0.14
raid                -0.14
Positive Logits
Token               Logit
icz                 0.16
уки                 0.15
None                0.15
лак                 0.15
Siz                 0.15
pyx                 0.15
aths                0.15
ResultsController   0.14
emics               0.14
ImageButton         0.14
Activations Density 0.122%