INDEX
Negative Logits
}_{-}\
0.75
}}_{
0.75
suburban
0.71
}_{
0.69
}_{
0.68
𝒟
0.67
🔜
0.67
➤
0.67
inden
0.66
suburb
0.66
Positive Logits
^
3.65
^
2.99
<sup>
2.90
^{
2.60
^(
2.59
$^
2.41
^^
2.31
.^
2.29
^{
2.27
$^{
2.27
Activations Density 0.172%