INDEX
Explanations
references to the character Castiel and related terms within the context of a narrative
Negative Logits
er        -0.21
suppress  -0.16
jak       -0.15
erap      -0.15
idence    -0.14
itional   -0.14
/UIKit    -0.14
Aceptar   -0.14
ái        -0.14
침        -0.14
Positive Logits
iron      0.27
.Cast     0.26
cast      0.25
Cast      0.25
Iron      0.24
ellan     0.23
.cast     0.22
les       0.22
legate    0.21
iron      0.20
Activations Density: 0.009%