INDEX
Explanations
words related to protection or safeguarding
terms related to protection and safety
Negative Logits
Token     Logit
ãĥ£       -0.80
bold      -0.68
RM        -0.67
mers      -0.64
umo       -0.62
sonian    -0.62
LINE      -0.61
hler      -0.61
leaf      -0.60
rys       -0.59
Positive Logits
Token     Logit
afforded  0.93
ively     0.92
iveness   0.84
orship    0.81
folios    0.80
racket    0.79
ously     0.77
dogs      0.76
aments    0.75
atively   0.75
Activations Density: 0.039%