Explanations
proper nouns related to various characters and personalities
specific references to films and their associated characteristics
Negative Logits
Token             Logit
guiName           -1.03
etheless          -0.98
glim              -0.77
tyr               -0.66
toget             -0.65
lyak              -0.62
âķIJâķIJ            -0.60
ometimes          -0.60
discrimination    -0.60
opian             -0.60
Positive Logits
Token             Logit
)                 1.84
)'                1.75
)"                1.72
),                1.68
?)                1.58
):                1.56
)",               1.53
)|                1.52
),"               1.50
).                1.49
Activations Density 0.480%