INDEX
Explanations
words related to annual sports events or competitions
Negative Logits
odia    -0.15
ihan    -0.14
ibt     -0.14
Impl    -0.14
Hud     -0.14
TINGS   -0.13
impl    -0.13
_HIT    -0.13
Reid    -0.13
Fen     -0.13
Positive Logits
baj     0.28
lab     0.27
torn    0.24
vers    0.23
Lab     0.22
sele    0.21
Lab     0.21
_lab    0.21
ed      0.20
vb      0.20
Activations Density: 0.000%