INDEX
Explanations
places or locations
locations related to venues and places where activities take place
Negative Logits
Token       Logit
tein        -0.93
CHAT        -0.89
potion      -0.73
XT          -0.69
Timer       -0.66
apego       -0.65
timer       -0.63
PUT         -0.63
otos        -0.62
kun         -0.61
Positive Logits
Token       Logit
hips        1.21
chool       1.15
nationwide  1.13
hops        1.13
hare        1.03
frequ       1.01
worldwide   0.99
paces       0.96
across      0.94
alike       0.93
Activations Density: 0.327%