INDEX
Explanations
expressions of excitement and positivity related to racing or competition
Negative Logits
ury            -0.16
anka           -0.14
thur           -0.14
libs           -0.14
anlar          -0.14
tel            -0.14
abis           -0.14
cutting        -0.14
lett           -0.14
participant    -0.13
Positive Logits
pole           0.21
poles          0.21
podium         0.18
ikk            0.17
ance           0.17
imin           0.16
bane           0.15
-*             0.14
Pole           0.14
Pod            0.14
Activations Density 0.035%