INDEX
Explanations
references to swimming and water-related activities
Negative Logits
igne    -0.19
टक      -0.15
eve     -0.15
사항    -0.15
eer     -0.15
óc      -0.15
사항    -0.14
IMA     -0.14
reet    -0.14
terra   -0.14
Positive Logits
swimming    0.21
Swimming    0.16
/body       0.16
erb         0.15
erman       0.15
/sw         0.15
untu        0.14
isode       0.14
Sad         0.14
paddle      0.13
Activations Density: 0.027%