INDEX
Explanations
references to the "Star Wars" franchise and its associated content
Negative Logits
Token        Logit
/XML         -0.18
.XML         -0.17
çĵ           -0.16
rob          -0.16
ATRIX        -0.16
oulos        -0.15
rud          -0.15
Nicar        -0.15
Colombian    -0.15
ignon        -0.15
Positive Logits
Token        Logit
Rey          0.39
Ky           0.38
Finn         0.33
Sno          0.32
Poe          0.31
Ky           0.30
Ren          0.29
BB           0.28
Leia         0.28
Maz          0.27
Activations Density 0.012%