Explanations
mentions of swimming and related activities
references to swimming activities and places
Negative Logits
pora        -0.69
eq          -0.66
oppers      -0.66
________________________  -0.65
misplaced   -0.64
Engel       -0.64
stuffing    -0.63
orate       -0.63
relevant    -0.63
________________________________________________________________  -0.62
Positive Logits
swimming    1.04
pools       0.99
pool        0.91
swim        0.91
pool        0.89
baths       0.88
suits       0.86
estone      0.86
Swim        0.86
halla       0.85
Activations Density 0.008%