INDEX
Explanations
references to records or notable achievements
Negative Logits
isan      -0.17
endor     -0.16
imoto     -0.16
izzato    -0.15
itzer     -0.15
840       -0.15
apon      -0.15
921       -0.15
iso       -0.14
tridge    -0.14
Positive Logits
-breaking    0.31
keeping      0.29
-setting     0.24
breaking     0.23
ings         0.23
ant          0.19
breaking     0.19
keeper       0.19
breaker      0.19
setters      0.18
Activations Density 0.028%