Explanations
details related to movies and cultural references
Negative Logits
gencies     -0.91
teness      -0.86
acity       -0.82
fw          -0.80
apo         -0.79
alde        -0.78
ffect       -0.77
erity       -0.74
parency     -0.72
arers       -0.72
Positive Logits
classic      0.87
bestselling  0.86
copyrighted  0.84
Akira        0.80
Herman       0.79
Gothic       0.79
classics     0.77
medieval     0.76
popular      0.75
screenplay   0.75
Activations Density: 0.135%