INDEX
Explanations
references to story plots or plans of action
references to plots or schemes in narratives
Negative Logits
IDA           -0.71
agles         -0.69
angelo        -0.69
Downloadha    -0.66
ertodd        -0.64
ILA           -0.62
Jazz          -0.62
Scot          -0.62
Gi            -0.62
Lak           -0.61
Positive Logits
ters          0.88
plot          0.86
Plot          0.84
twists        0.83
lines         0.82
line          0.79
plotting      0.77
ories         0.77
cases         0.76
tle           0.76
Activations Density: 0.017%