INDEX
Explanations
instances of prompts or transitions in discussions
Negative Logits
Token       Logit
minster     -0.86
amorph      -0.70
icone       -0.69
witnesses   -0.65
ModLoader   -0.65
uyomi       -0.65
uments      -0.64
aeus        -0.63
adoes       -0.63
alleged     -0.63
Positive Logits
Token       Logit
follow      0.64
cour        0.64
bnb         0.62
consider    0.60
league      0.59
optional    0.58
galitarian  0.57
[+          0.57
ital        0.57
econom      0.56
Activations Density: 0.129%