INDEX
Explanations
terminology related to protection or being safeguarded
references to the concept of protection or safeguarding
Negative Logits
Token        Logit
izzle        -0.64
itch         -0.64
rum          -0.62
iem          -0.62
palate       -0.61
ISTORY       -0.61
iga          -0.61
ology        -0.61
gram         -0.61
alities      -0.60
Positive Logits
Token        Logit
protected    3.79
protected    2.91
unprotected  2.25
shielded     1.95
protection   1.73
protects     1.68
protecting   1.65
protect      1.65
guarded      1.61
safegu       1.60
Activations Density: 0.021%