Explanations
instances of the word "Am"
Negative Logits
Token     Logit
eum       -0.16
onet      -0.16
enko      -0.16
pas       -0.15
ean       -0.15
éĽĦ       -0.15
point     -0.14
pa        -0.14
points    -0.14
onen      -0.14
Positive Logits
Token     Logit
plitude   0.28
igos      0.27
ended     0.26
ateurs    0.25
alg       0.25
ateur     0.24
sterdam   0.24
iens      0.24
usement   0.23
igo       0.22
Activations Density 0.016%
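
For a sense of scale, the density figure can be read as the share of token positions on which the feature fires. The sketch below is a minimal illustration and not the dashboard's actual computation: it assumes density means "fraction of positions with activation above a threshold", and the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: this is one common way to define density; the source does
    not state which definition the dashboard uses.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, 16 of which are active.
sample = [0.0] * 99_984 + [1.5] * 16
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that assumed definition, a density of 0.016% corresponds to roughly 16 active positions per 100,000 tokens, which suggests a sparse feature.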