INDEX
Explanations
geographical locations or affiliation
locations
Negative Logits
help            -0.50
<bos>           -0.45
portál          -0.43
THREADS         -0.40
あとは          -0.40
號              -0.39
<eos>           -0.39
enschappelijke  -0.39
he              -0.39
He              -0.38
Positive Logits
houſe           0.80
Theſe           0.79
Majefty         0.71
Monfieur        0.71
صوتيه           0.71
purpoſe         0.69
Houſe           0.69
itſelf          0.68
blockList       0.67
pleaſure        0.67
Activations Density 0.568%