Explanations
references to gyms and fitness-related activities
Negative Logits
EndInit    -0.55
ו    -0.52
ostavi    -0.51
通販    -0.50
цездатний    -0.49
stray    -0.48
avanzada    -0.47
Goi    -0.46
juges    -0.45
рованных    -0.45
Positive Logits
gym    2.40
Gym    2.17
gyms    2.09
fitness    1.99
Fitness    1.95
gym    1.94
Gym    1.90
Fitness    1.81
fitness    1.74
gymnasium    1.66
Activations Density: 0.045%