INDEX
Explanations
references to fictional characters or settings, especially in relation to gaming and storytelling
Negative Logits
paddle     -0.20
padd       -0.17
station    -0.16
dn         -0.16
Station    -0.15
llx        -0.15
stations   -0.14
Station    -0.14
Stations   -0.14
tle        -0.14
Positive Logits
Witch      0.32
Ger        0.28
witch      0.27
Ger        0.26
ger        0.23
witch      0.23
CD         0.23
CD         0.18
aska       0.18
Henry      0.18
Activations Density 0.002%