INDEX
Explanations
phrases and terms related to quantities and measurements
Negative Logits
airs      -0.18
242       -0.17
uga       -0.15
DIS       -0.15
zo        -0.15
ken       -0.15
gra       -0.14
obar      -0.14
all       -0.14
amburg    -0.14
Positive Logits
-thirds   0.19
nd        0.16
gether    0.16
/th       0.16
ispecies  0.15
athers    0.15
ernaut    0.15
inding    0.14
tandem    0.14
amber     0.14
Activations Density: 0.498%