INDEX
Explanations
references to characters and their roles in films
Negative Logits
colon        -0.16
adele        -0.16
ICON         -0.16
benh         -0.15
agan         -0.15
borderTop    -0.15
anou         -0.15
ーター       -0.14
eld          -0.14
Sher         -0.14
Positive Logits
Mate         0.29
mate         0.28
mates        0.26
Ratings      0.25
ratings      0.25
Mate         0.25
ratings      0.25
Deck         0.23
Rating       0.23
rating       0.23
Activations Density 0.044%
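
As a rough illustration of how a density figure like this can be read, the sketch below treats activation density as the fraction of token positions on which the feature fires. That definition, the `activation_density` helper, the `threshold` parameter, and the sample data are all assumptions for illustration; the dashboard itself does not state how the number is computed.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumed definition: the source page does not say how density is computed.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 4 of which are active.
sample = [0.0] * 9996 + [1.3, 0.7, 2.1, 0.2]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.044% would mean the feature is active on roughly 44 of every 100,000 token positions.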