INDEX
Explanations
initials and names beginning with 'J' that recur with high activations, such as 'JC', 'JM', 'JJ', 'JD', and 'Jindal'
references to specific individuals or prominent names
Negative Logits
uve        -0.72
holders    -0.71
rights     -0.68
dayName    -0.64
leton      -0.64
disperse   -0.64
otide      -0.63
worth      -0.62
arest      -0.61
mates      -0.61
Positive Logits
ealous     0.98
ordan      0.89
JC         0.83
ihad       0.82
upiter     0.82
ournals    0.80
unal       0.80
JA         0.78
JJ         0.77
ournal     0.77
Activations Density 0.033%
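As a rough illustration of what a density of 0.033% amounts to, the sketch below assumes that activation density means the fraction of tokens on which the feature has a nonzero activation. That definition, the helper name `activation_density`, and the sample data are assumptions for illustration only; none of them come from this page.

```python
def activation_density(activations):
    """Return the fraction of tokens whose activation is above zero.

    Assumption: "density" is the share of tokens on which the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > 0)
    return active / len(activations)


# Hypothetical sample: the feature fires on 33 of 100,000 tokens.
sample = [1.0] * 33 + [0.0] * (100_000 - 33)
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.033%
```

Under that reading, the feature is active on roughly 33 of every 100,000 tokens.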