INDEX
Explanations
phrases indicating examples or instances of something
Negative Logits
antage      -0.70
ribution    -0.68
ombat       -0.67
emi         -0.66
ushima      -0.65
dollar      -0.63
rition      -0.61
ogun        -0.61
olitical    -0.61
emale       -0.60
Positive Logits
ties         0.71
cond         0.66
worldly      0.63
minded       0.60
Osw          0.60
embodiments  0.59
things       0.57
namely       0.56
EntityItem   0.55
odon         0.54
Activations Density 4.344%