INDEX
Explanations
instances of punctuation marks
negative exclamations and intensifiers
Negative Logits
providing     -0.63
provides      -0.59
provide       -0.55
highly        -0.53
utilizing     -0.51
אשר           -0.50
presented     -0.49
enabling      -0.48
primarily     -0.48
subsequent    -0.48
Positive Logits
mierda        0.65
shitty        0.65
jeito         0.64
sowas         0.63
fuckin        0.62
crappy        0.61
houſe         0.59
goddamn       0.59
fuck          0.57
fuck          0.57
Activations Density: 0.166%