INDEX
Explanations
anger or difficult emotions
Negative Logits
ी  0.43
ipers  0.40
Mankind  0.38
xs  0.37
eh  0.37
Mosley  0.37
pb  0.37
ehm  0.37
сигна  0.36
composta  0.36
Positive Logits
Barrier  0.39
ometown  0.37
änd  0.37
напа  0.37
лыгы  0.37
tathapi  0.36
ätta  0.35
raint  0.35
義務  0.35
ruct  0.34
Activations Density: 0.013%