Explanations
abstract qualities and states
Negative Logits
nhàng    0.69
۳    0.68
vraie    0.66
ಗೊತ್ತ    0.64
이날    0.63
県    0.63
ቦታ    0.62
ෆ    0.62
ljudi    0.61
٣    0.61
Positive Logits
2    1.24
'    1.01
8    0.93
在    0.89
ig    0.86
in    0.84
↵    0.84
are    0.83
1    0.82
6    0.82
Activations Density 0.045%