INDEX
Explanations
short informative sentences or phrases typically found at the beginning or end of text segments
phrases that introduce lists or breakdowns of information
Negative Logits
ibles      -0.73
aukee      -0.70
Ended      -0.67
afterlife  -0.66
mattered   -0.66
underdog   -0.64
Soon       -0.64
Surprise   -0.61
Tokens     -0.61
Weird      -0.61
Positive Logits
ographical  0.76
reproduce   0.73
excerpt     0.72
links       0.72
ption       0.72
excerpts    0.71
appl        0.70
xual        0.69
rame        0.68
amas        0.68
Activations Density: 0.098%