Explanations
the name "Morelli"
repeated use of the name "Lilli"
Negative Logits
rooms     -0.75
mal       -0.74
LESS      -0.72
ifiers    -0.67
Alive     -0.66
grounds   -0.66
lines     -0.66
binary    -0.65
Corpus    -0.65
rush      -0.65
Positive Logits
otti      1.12
quist     0.94
lli       0.92
zzi       0.92
zzle      0.90
zzo       0.90
oyd       0.89
oti       0.89
ptin      0.88
orno      0.87
Activations Density 0.006%