INDEX
Explanations
phrases related to dimensions or measurements
Negative Logits
foot    -0.16
dol     -0.15
etat    -0.15
er      -0.15
kus     -0.15
mars    -0.15
Affero  -0.15
ehler   -0.15
heet    -0.14
vip     -0.14
Positive Logits
wise    0.37
ened    0.37
ening   0.36
iness   0.26
wise    0.25
iest    0.25
-wise   0.24
ier     0.23
wis     0.22
ily     0.21
Activations Density 0.071%