INDEX
Explanations
words related to security breaches
mentions of breaches or violations
Negative Logits
uana            -0.88
redistributed   -0.68
lee             -0.67
zl              -0.67
stag            -0.65
minist          -0.65
ICA             -0.64
onga            -0.63
livest          -0.63
opped           -0.63
Positive Logits
breaches        0.88
terness         0.81
breach          0.80
ingly           0.77
hold            0.76
hole            0.73
Breach          0.72
aucus           0.72
holes           0.70
ware            0.70
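
Logit lists like the two above are commonly produced by projecting a feature's decoder direction through the model's unembedding matrix and keeping the most extreme entries at each end. Whether this page computes its values exactly that way is an assumption. The sketch below illustrates the idea with toy numbers; `top_logit_tokens`, the placeholder vocabulary, and the weights are all hypothetical and do not come from this feature.

```python
def top_logit_tokens(decoder_direction, unembedding, vocab, k=3):
    """Rank tokens by the logit a feature direction contributes to each.

    decoder_direction: list of floats, length d_model (hypothetical input).
    unembedding: one row per token, each row a list of d_model floats.
    vocab: token strings, aligned with the rows of `unembedding`.
    Returns (positive, negative): the k highest and k lowest (token, score) pairs.
    """
    # Dot product of the feature direction with each token's unembedding row.
    scores = [
        sum(d * w for d, w in zip(decoder_direction, row))
        for row in unembedding
    ]
    ranked = sorted(zip(vocab, scores), key=lambda pair: pair[1])
    negative = ranked[:k]          # most negative first
    positive = ranked[::-1][:k]    # most positive first
    return positive, negative


# Toy example: 4 placeholder tokens, d_model = 2. All values are made up.
toy_vocab = ["tok_a", "tok_b", "tok_c", "tok_d"]
toy_unembedding = [[0.8, 0.1], [0.7, -0.2], [-0.6, 0.3], [-0.9, 0.0]]
toy_direction = [1.0, 0.0]

positive, negative = top_logit_tokens(toy_direction, toy_unembedding, toy_vocab, k=2)
print("positive:", positive)
print("negative:", negative)
```

With a real model, the decoder direction and unembedding matrix would come from the trained weights, and the score lists would typically be computed with a single matrix product rather than a Python loop.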
Activations Density 0.032%
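
Activation density is usually read as the fraction of sampled tokens on which the feature fires, so 0.032% corresponds to roughly 32 active tokens per 100,000. The sketch below shows that arithmetic; the helper `activation_density`, the zero threshold, and the synthetic sample are assumptions for illustration, not data from this feature.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`."""
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic sample: 100,000 tokens, 32 of which activate the feature.
sample = [0.0] * 99_968 + [1.5] * 32
density = activation_density(sample)
print(f"{density:.3%}")
```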