INDEX
Explanations
secure access and protection
Negative Logits
Token          Logit
injust         0.71
infringed      0.70
violations     0.68
গুরুতর          0.65
violated       0.65
unfortunately  0.64
気軽に          0.64
violation      0.64
casus          0.63
сожалению      0.62
Positive Logits
Token          Logit
shielded       1.11
impermeable    1.08
neutral        1.07
opaque         1.04
non            1.01
Neutral        0.99
inert          0.99
NON            0.98
locking        0.98
sealed         0.98
Activations Density 0.327%