INDEX
Explanations
names of characters and locations from a specific fictional series
character names from a specific TV series
Negative Logits
itud       -0.78
trak       -0.75
score      -0.73
isconsin   -0.72
EPA        -0.69
iba        -0.68
lli        -0.66
itudes     -0.64
oard       -0.63
izu        -0.63
Positive Logits
Lann       1.60
Bolton     1.18
Stark      1.12
Thrones    1.10
Wester     1.08
Targ       1.05
Robb       0.92
Archdemon  0.85
compr      0.84
tyr        0.82
Activations Density 0.027%