INDEX
Explanations
references to a "pool."
references to various types of pools
Negative Logits
Token        Logit
rians        -0.74
NESS         -0.72
rian         -0.71
eme          -0.70
DonaldTrump  -0.64
Realms       -0.64
orial        -0.64
sbm          -0.60
®            -0.59
Cel          -0.59
Positive Logits
Token    Logit
pool     1.11
pool     1.05
esville  1.03
Pool     0.96
pools    0.96
eries    0.89
regate   0.87
side     0.84
hare     0.80
erves    0.77
Activations Density: 0.028%