Explanations
questions about subjective judgment
Negative Logits
역시	0.95
やはり	0.95
denominado	0.95
tekint	0.93
이러한	0.89
또한	0.88
どのような	0.88
Nous	0.85
utiliza	0.84
તેમજ	0.83
Positive Logits
subconsciously	1.13
crappy	1.09
hating	1.05
unconsciously	1.03
objectively	1.02
mediocre	0.97
sucks	0.96
shitty	0.95
nagging	0.94
rationally	0.93
Activations Density: 0.345%