INDEX
Explanations
proper nouns, particularly names of people
references to specific individuals or characters associated with media content, particularly focusing on the word "itter."
Negative Logits
cephal       -0.70
Kingdoms     -0.67
urat         -0.62
routed       -0.62
arteries     -0.60
ilitation    -0.58
lez          -0.58
translate    -0.57
looting      -0.57
Ward         -0.56
Positive Logits
geist        1.10
bug          0.99
sburg        0.95
cair         0.87
mite         0.83
iver         0.79
bugs         0.79
eer          0.78
tainment     0.77
idge         0.77
Activations Density: 0.028%