INDEX
Explanations
references to popular movies and books
occurrences of the word "the."
Negative Logits
Token      Logit
lington    -0.69
alas       -0.67
ward       -0.67
INA        -0.64
ilde       -0.63
imaru      -0.63
fully      -0.63
BER        -0.62
="#        -0.61
ARA        -0.60
Positive Logits
Token          Logit
apocalypse     0.84
Ancients       0.84
Apocalypse     0.78
nutshell       0.77
millennium     0.76
Nile           0.73
Confederacy    0.73
beast          0.72
Alps           0.71
Gods           0.71
Activations Density: 0.194%