Explanations
numbers at the start of a line or phrase
numerical values and references to voting data or rankings
Negative Logits
aten         -0.71
advertising  -0.70
Herz         -0.65
RG           -0.61
oros         -0.61
idden        -0.60
è»           -0.59
Tour         -0.59
strugg       -0.59
proxy        -0.59
Positive Logits
uther         0.66
ucer          0.64
Andromeda     0.62
pee           0.61
pc            0.60
Cinnamon      0.60
ooks          0.58
nesia         0.57
Lawn          0.57
μ             0.57
Activations Density 0.265%