Explanations
proper nouns
numerical values and references to specific years or dates
Negative Logits
neighb        -0.77
derby         -0.75
swall         -0.74
dubbed        -0.71
edges         -0.71
redes         -0.70
relegation    -0.70
tram          -0.68
betting       -0.68
bundled       -0.68
Positive Logits
"â̦           1.75
"[            1.69
"...          1.67
"'            1.63
Dear          1.61
Quote         1.56
"             1.45
"(            1.42
When          1.39
Hello         1.38
Activations Density: 0.107%