Explanations
mentions of "pool" in various contexts
Negative Logits
assen      -0.18
noch       -0.16
habit      -0.15
appen      -0.15
bred       -0.14
uide       -0.14
adele      -0.14
545        -0.14
Stre       -0.14
BTN        -0.14
Positive Logits
Giang      0.16
olarity    0.15
ãĤº        0.15
wind       0.14
iller      0.14
Better     0.14
kker       0.14
enactment  0.13
ifo        0.13
ilo        0.13
Activations Density 0.005%