INDEX
Explanations
first presentation, time, person, program, step
Negative Logits
'      0.41
iidae  0.40
inete  0.39
že     0.39
daha   0.39
bzw    0.38
\      0.38
neden  0.38
siis   0.38
④      0.38
Positive Logits
first       0.62
first       0.61
primeira    0.54
responders  0.54
First       0.52
पहला        0.52
첫          0.52
Первая      0.51
이자        0.50
Pertama     0.50
Activations Density 0.077%