INDEX
Explanations
the word "all"
repetitive phrases indicating inclusivity or universality
Negative Logits
Token      Logit
abwe       -0.70
nowhere    -0.69
tremend    -0.67
rir        -0.66
SHIP       -0.66
Kamp       -0.62
chie       -0.61
flation    -0.61
hran       -0.60
Fidel      -0.58
Positive Logits
Token      Logit
ocating    1.14
igator     1.12
kinds      1.11
iances     1.06
sorts      1.02
igators    0.98
iance      0.97
ocation    0.95
uv         0.94
usions     0.94
Activations Density: 0.124%