Explanations
references to significant disasters or violence
Negative Logits
ąt  -0.47
efeller  -0.45
cleo  -0.45
كومونز  -0.45
bienvenue  -0.44
Exit  -0.44
assolutamente  -0.44
kalian  -0.43
consuls  -0.43
ترة  -0.42
Positive Logits
wildfires  0.79
ravaged  0.78
AndEndTag  0.77
fires  0.76
fires  0.75
DAMAGE  0.72
wildfire  0.71
devastation  0.69
floods  0.69
IUrlHelper  0.69
Activations Density 0.425%