INDEX
Explanations
proper nouns, specifically related to individuals like public figures
repeated mentions of individual names, particularly "Kagan" and "Ogan"
Negative Logits
upon            -0.75
mutual          -0.68
Score           -0.67
İĭ              -0.67
luck            -0.63
backlog         -0.63
âĢ¢âĢ¢          -0.63
apprehension    -0.62
abdom           -0.62
predictive      -0.62
Positive Logits
igans           1.04
agan            1.04
za              1.03
atin            0.96
arde            0.96
zeb             0.96
lein            0.92
ocene           0.91
azi             0.88
itating         0.87
Activations Density 0.007%
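
Two entries in the negative logits list, "İĭ" and "âĢ¢âĢ¢", look like raw byte-level vocabulary strings rather than readable text. The sketch below shows one way to map such strings back to bytes, assuming the underlying tokenizer uses a GPT-2 style byte-to-character table. The source does not name the model or tokenizer, so that is an assumption, and the helper names (`bytes_to_unicode`, `token_bytes`, `decode_token`) are illustrative only.

```python
def bytes_to_unicode():
    """Byte -> printable-character table in the style of GPT-2 byte-level BPE.

    Assumption: the dashboard's tokenizer follows this convention.
    """
    # Bytes that are already printable map to themselves.
    printable = (
        list(range(0x21, 0x7F))     # '!' .. '~'
        + list(range(0xA1, 0xAD))   # '¡' .. '¬'
        + list(range(0xAE, 0x100))  # '®' .. 'ÿ'
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    # Every remaining byte is assigned the next code point from 256 upward.
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return dict(zip(byte_values, map(chr, code_points)))


# Inverse table: displayed character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def token_bytes(token):
    """Return the raw bytes behind a byte-level vocabulary string."""
    return bytes(CHAR_TO_BYTE[ch] for ch in token)


def decode_token(token):
    """Decode the bytes as UTF-8; invalid sequences become U+FFFD."""
    return token_bytes(token).decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["âĢ¢âĢ¢", "İĭ"]:
        print(repr(tok), token_bytes(tok).hex(), repr(decode_token(tok)))
```

Under this assumption, "âĢ¢âĢ¢" corresponds to the bytes e2 80 a2 e2 80 a2, which is two bullet characters, while "İĭ" corresponds to the bytes 8e 8b, a pair of UTF-8 continuation bytes that do not form a complete character on their own.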