INDEX
Explanations
the word "dam" and words starting with "dam"
Negative Logits
Token       Logit
Mario       -0.57
راقي        -0.57
lecte       -0.56
PACE        -0.56
نوز         -0.56
HomeAsUp    -0.55
No          -0.55
gea         -0.55
aternary    -0.54
ukkah       -0.54
Positive Logits
Token       Logit
dams        1.70
dam         1.63
Dam         1.59
Dam         1.53
dam         1.44
Dams        1.40
DAM         1.33
dams        1.20
DAM         1.13
Damon       1.05
Activations Density: 0.005%