INDEX
Explanations
mentions of the word "protection"
references to the concept of protection in various contexts
Negative Logits
bold      -0.70
ãĥ£       -0.70
mers      -0.68
nexus     -0.63
Kinn      -0.63
hig       -0.63
hler      -0.58
leaf      -0.58
earch     -0.58
sonian    -0.58
Positive Logits
ively     1.00
iveness   0.97
afforded  0.95
ously     0.84
aments    0.80
folios    0.79
against   0.79
dogs      0.78
ective    0.77
atively   0.77
Activations Density: 0.043%