INDEX
Explanations
mentions of places or entities with the substring "amm"
Negative Logits
nces        -0.69
Walk        -0.66
threat      -0.62
pole        -0.62
walk        -0.62
tilt        -0.61
grade       -0.60
meal        -0.60
Intent      -0.59
selective   -0.59
Positive Logits
olit        1.03
iversary    0.97
erer        0.96
ock         0.94
arella      0.94
oths        0.93
igrants     0.93
uth         0.89
unity       0.88
osexual     0.88
Activations Density 0.025%