INDEX
Explanations
expressions of disappointment and questioning statements
Follows commas or periods
disagreement or rejection
Negative Logits
<bos>          -0.65
ніципа         -0.54
defaultstate   -0.48
aspectj        -0.46
spô            -0.45
altında        -0.44
způ            -0.40
conexao        -0.40
długość        -0.40
restes         -0.40
Positive Logits
WRONG          1.19
Nope           1.09
wrong          1.08
Wrong          1.07
WRONG          1.05
wrong          1.02
nope           0.99
nope           0.98
Nope           0.96
Wrong          0.94
Activations Density 0.206%