Explanations
words related to a specific designation or labeling system
Negative Logits
étoient       -0.61
Infór         -0.59
Italij        -0.57
recargable    -0.56
plegable      -0.56
enfans        -0.55
hendes        -0.55
født          -0.54
виправивши    -0.54
:✨            -0.54
Positive Logits
Af            1.13
Af            1.02
af            0.95
AF            0.94
af            0.93
AF            0.85
afl           0.62
AFR           0.61
afs           0.59
Afr           0.57
Activations Density 0.424%