INDEX
Explanations
repetitive expressions of agreement or acknowledgement
Negative Logits
tic     -0.75
nin     -0.71
c       -0.71
ni      -0.70
len     -0.69
nik     -0.67
ity     -0.65
ment    -0.65
se      -0.65
ls      -0.64
Positive Logits
also       1.29
ALSO       1.26
ALSO       1.25
кож        1.22
Também     1.21
gså        1.16
וגם        1.12
also       1.12
wnież      1.11
También    1.10
Activations Density: 0.158%