INDEX
Explanations
specific page numbers and citations in a written text
Negative Logits
settled       -0.67
Clicker       -0.67
Chow          -0.62
awoke         -0.62
Stronghold    -0.60
Ronaldo       -0.59
Ragnarok      -0.58
settlement    -0.58
overnight     -0.57
wiser         -0.57
Positive Logits
seq    0.99
VII    0.94
xx     0.91
iii    0.87
139    0.87
409    0.86
xxx    0.85
371    0.85
412    0.83
149    0.82
Activations Density: 0.068%