Explanations
sentence punctuation and quotation marks
Negative Logits
Token      Logit
Media      -0.49
Stor       -0.48
media      -0.48
vam        -0.46
media      -0.45
swear      -0.45
chio       -0.44
Middel     -0.44
\\         -0.44
minmax     -0.42
Positive Logits
Token      Logit
IsMutable  0.78
RTGC       0.76
'          0.72
"          0.66
gewohnt    0.65
انيف       0.64
„          0.62
Datuak     0.62
følgelig   0.62
ويكيميديا  0.61
Activations Density: 0.330%