INDEX
Explanations
argument structure: consequent, sequence, fallacy
Negative Logits
0.57
лимп      0.56
浈        0.53
嚗        0.53
颞        0.52
⦕         0.51
způ       0.50
notValid  0.50
resTmp    0.50
смартфо   0.49
Positive Logits
S     0.63
.     0.56
Y     0.55
W     0.55
↵     0.54
a     0.52
data  0.51
N     0.50
Z     0.49
test  0.49
Activations Density 0.010%