INDEX
Explanations
phrases that highlight absurd or humorous narratives and unconventional scenarios
Negative Logits
disegni        -0.58
appunto        -0.56
ParallelGroup  -0.55
Almost         -0.52
soldati        -0.52
fermée         -0.52
becauſe        -0.51
samarbeid      -0.51
eszköz         -0.51
onCancelled    -0.51
Positive Logits
rinol          0.67
hemor          0.65
friger         0.62
∮              0.61
twimg          0.60
phyla          0.59
Thong          0.59
orghini        0.59
Encyclopedia   0.58
ghijklmnop     0.58
Activations Density 0.515%