INDEX
Explanations
tokens that indicate special formatting or markers in the text
Negative Logits
jalá           -0.66
becauſe        -0.56
Eſ             -0.54
inexp          -0.53
iſt            -0.53
enormously     -0.52
niedersachsen  -0.52
daardoor       -0.52
Viewed         -0.52
obſ            -0.51
Positive Logits
tasty        0.73
beginnetje   0.72
bling        0.71
groovy       0.71
furry        0.69
sweet        0.69
shenanigans  0.68
festive      0.68
frosty       0.67
juicy        0.66
Activations Density: 0.353%